Depending on what areas of the internet you frequent, perhaps you were under the illusion that thoughts-to-text technology already existed; we all have that one mutual or online friend that we gently hope will perhaps one day post slightly less. Well, recently Meta has announced that a number of their research projects are coming together to form something that might even improve real people’s lives—one day. Maybe!
Way back in 2017, Meta (at that time just called ‘Facebook’) talked a big game about “typing by brain.” Fast forward to now and Meta has shared news of two breakthroughs that make those earlier claims seem more substantial than a big sci-fi thought bubble (via MIT Technology Review). Firstly, Meta announced research that has created an AI model which “successfully decodes the production of sentences from non-invasive brain recordings, accurately decoding up to 80% of characters, and thus often reconstructing full sentences solely from brain signals.”
The second study Meta shared then examines how AI can facilitate a better understanding of how our brains slot the Lego bricks of language into place. For people who have lost the ability to speak after traumatic brain injuries, or who otherwise have complex communication needs, all of this scientific research could be genuinely life-changing. Unfortunately, this is where I burst the bubble: the ‘non-invasive’ device Meta used to record brain signals so that they could be decoded into text is huge, costs $2 million, and makes you look a bit like Megamind.
Dated reference to an animated superhero flick for children aside, Meta has been all about brain-computer interfaces for years. More recently they’ve even demonstrated a welcome amount of caution when it comes to the intersection of hard and ‘wet’ ware.
This time, the Meta Fundamental Artificial Intelligence Research (FAIR) lab collaborated with the Basque Center on Cognition, Brain and Language to record the brain signals of 35 healthy volunteers as they typed. Those brain signals were recorded using the aforementioned hefty headgear—specifically a MEG scanner—and then interpreted by a purpose-trained deep neural network.
Meta wrote, “On new sentences, our AI model decodes up to 80% of the characters typed by the participants recorded with MEG, at least twice better than what can be obtained with the classic EEG system.”
This essentially means that recording the magnetic fields produced by the electrical currents within the participants’ brains resulted in data the AI could more accurately interpret, compared to just recording the electrical activity itself via an EEG. However, by Meta’s own admission, this does not leave the research in the most practical of places.
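To make that "up to 80% of the characters" figure a little more concrete, here's a minimal sketch of what a character-level accuracy check can look like. This is purely my own illustration, not Meta's evaluation code (their research uses a proper character error rate based on edit distance), and the sentences are hypothetical:

```python
# Toy illustration of character-level decoding accuracy: compare what a
# participant typed against what a decoder produced, position by position.
# Real evaluations use character error rate (CER) via edit distance;
# this position-wise match is a deliberate simplification.

def char_accuracy(typed: str, decoded: str) -> float:
    """Fraction of typed characters the decoder got right, by position."""
    if not typed:
        return 0.0
    matches = sum(t == d for t, d in zip(typed, decoded))
    return matches / len(typed)

# Hypothetical example: two wrong characters out of nineteen.
typed = "the quick brown fox"
decoded = "the quick brawn fux"  # imaginary decoder output
print(f"{char_accuracy(typed, decoded):.0%}")
```

A decoder that gets most characters right can still mangle individual words, which is why Meta hedges with "often reconstructing full sentences" rather than "always".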
For one, MEG scanners are far from helmets you can just pop on and off: they're specialised pieces of equipment that require patients to sit still in a shielded room. Besides that, this study used a comparatively tiny sample of participants, none of whom had a known traumatic brain injury or speech difficulties. This means it's yet to be seen just how well Meta's AI model can interpret for those who really need it.
Still, as a dropout linguist myself, I'm intrigued by Meta's findings when it comes to how we string sentences together in the first place. Meta begins by explaining, "Studying the brain during speech has always proved extremely challenging for neuroscience, in part because of a simple technical problem: moving the mouth and tongue heavily corrupts neuroimaging signals." In light of this practical reality, having participants type instead of speak is kind of genius.
So, what did Meta find? It’s exactly like I said before: Linguistic Lego bricks, baby. Okay, that’s an oversimplification, so I’ll quote Meta directly once more: “Our study shows that the brain generates a sequence of representations that start from the most abstract level of representations—the meaning of a sentence—and progressively transform them into a myriad of actions, such as the actual finger movement on the keyboard […] Our results show that the brain uses a ‘dynamic neural code’—a special neural mechanism that chains successive representations while maintaining each of them over long time periods.”
To put it another way, your brain starts with vibes, unearths meaning, daisy chains those Lego bricks together, then transforms the thought into the action of typing…yeah, I would love to see the AI try to interpret the magnetic fields that led to that sentence too.