This article was produced as part of our “My research project in 800 words” initiative.

Understanding the neural mechanisms of human cognition is a leading topic in neuroscience research. By combining brain imaging techniques and deep neural networks, it’s possible to build a real-time brain decoder that can analyze a short series of brain images and predict the underlying cognitive process. Such research provides fundamental guidance for the development of artificial intelligence and could help bridge the human-machine divide.

Have you ever wished you could read others’ thoughts or even understand their deepest feelings or intentions? Science is also looking at this question. The main goal of my project is to “read” people’s minds in real time using modern brain imaging and advanced artificial intelligence techniques.

Artificial intelligence aims to imitate human intelligence. However, our understanding of human cognition is still limited. Modern imaging techniques such as functional magnetic resonance imaging (fMRI) offer a powerful, non-invasive way to map human cognitive function. Researchers can apply machine learning techniques to the complex patterns of brain activity captured in fMRI images and predict the corresponding cognitive state, for instance a person’s thoughts or emotions. This technique is called brain decoding or mind reading. Great advances have been made in brain decoding over the past decade, moving from static images, e.g. recognizing a face or a house, to more natural stimuli, including reconstructing frames of movies or even visualizing someone’s dreams or imagination.

In the mind-reading project, I proposed an end-to-end solution for real-time brain decoding which used deep artificial neural networks to analyze a short series of fMRI scans taken while a participant was performing specific cognitive tasks. The brain decoding model was able to detect whether the participant was moving their hands, solving scientific problems, listening to a story, watching a funny video, handling social relationships, or experiencing specific emotions. My decoding model was tested using a large task-fMRI database acquired from the Human Connectome Project (HCP). The database included functional brain imaging of 1200 healthy subjects taken while they performed over 20 different cognitive tasks. By using 10 seconds of fMRI scans, the model differentiated between the 20 cognitive states with 90% decoding accuracy.
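To make the idea of decoding from "a short series of fMRI scans" more concrete, here is a minimal sketch of how a recording could be cut into short, labeled windows before being passed to a classifier. It is an illustration only, not the project's actual pipeline: the repetition time of 0.72 seconds per volume is an assumption, and the function names, region count, and task labels are hypothetical.

```python
from typing import List, Sequence, Tuple

TR_SECONDS = 0.72       # ASSUMPTION: seconds per fMRI volume (not stated in the article)
WINDOW_SECONDS = 10.0   # window duration quoted in the article

Volume = Sequence[float]  # one scan, reduced to one signal value per brain region


def volumes_per_window(window_seconds: float, tr_seconds: float) -> int:
    """Number of whole volumes that fit inside the window (at least one)."""
    return max(1, int(window_seconds / tr_seconds))


def cut_windows(
    series: Sequence[Volume],
    labels: Sequence[str],
    window_len: int,
    step: int = 1,
) -> List[Tuple[List[Volume], str]]:
    """Slide a fixed-length window over the recording.

    Only windows whose volumes all carry the same task label are kept,
    so every training example belongs to exactly one cognitive state.
    """
    if len(series) != len(labels):
        raise ValueError("need exactly one task label per volume")
    windows = []
    for start in range(0, len(series) - window_len + 1, step):
        stop = start + window_len
        if len(set(labels[start:stop])) == 1:
            windows.append((list(series[start:stop]), labels[start]))
    return windows


# Toy recording: 30 volumes, 2 regions, two hypothetical task blocks.
toy_series = [[float(t), 0.5 * t] for t in range(30)]
toy_labels = ["motor"] * 15 + ["story"] * 15

window_len = volumes_per_window(WINDOW_SECONDS, TR_SECONDS)
windows = cut_windows(toy_series, toy_labels, window_len)
```

Each `(window, label)` pair would then serve as one training example for the decoder. Windows that straddle two task blocks are discarded here for simplicity; a real pipeline might treat them differently.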

However, there are still big gaps between neuroscience research and building real-world mind-reading products. One of the biggest challenges is generalizing mind reading across large populations from different societies, cultures, and ethnic groups, and across a wide variety of cognitive tasks. So far, most brain decoding research has been carried out on small groups of people undertaking a handful of experimental tasks. For instance, researchers were able to determine whether a participant was looking at a human or an animal face, based on the functional brain images of 7 to 10 participants. To overcome this challenge, I used advanced deep learning tools, including deep convolutional neural networks and graph neural networks, to build a brain decoding algorithm that generalizes across the individual and population variability found among thousands of subjects and can be used for a variety of cognitive tasks. The model provides an end-to-end solution by handling all task conditions at the same time, ranging from recognizing an image or listening to a story to performing different types of body movements or experiencing different social and emotional states. It requires no prior domain knowledge, such as the expectation that the primary and secondary visual cortex activate during a visual task. This feature makes it a promising candidate for transfer learning between neuroimaging datasets, especially since a common problem for deep learning on functional brain images is the lack of sufficiently large datasets to train complex models. In a follow-up study, I demonstrated a significant boost in decoding performance for a small group (12 subjects) after transferring deep learning results from a large population (1000 subjects), regardless of the domain used to train the base model.
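The transfer learning idea can be illustrated with a deliberately small sketch. It is not the deep network used in the project: it swaps in a plain logistic regression on synthetic data, and it assumes one common form of transfer, namely pretraining on a large source group and then fine-tuning the same weights on a small target group. The group sizes echo the 1000 and 12 subjects mentioned above; every other name and number is hypothetical.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(weights, xs, ys):
    """Mean logistic loss of a linear classifier (no bias term)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(sum(w * v for w, v in zip(weights, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def train(weights, xs, ys, lr=0.1, epochs=100):
    """Full-batch gradient descent, starting from the given weights."""
    w = list(weights)
    n = len(xs)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def make_data(rng, n, true_w, noise=0.5):
    """Synthetic two-class data generated by a noisy linear rule."""
    xs = [[rng.gauss(0.0, 1.0) for _ in true_w] for _ in range(n)]
    ys = []
    for x in xs:
        score = sum(w * v for w, v in zip(true_w, x)) + rng.gauss(0.0, noise)
        ys.append(1 if score > 0 else 0)
    return xs, ys


rng = random.Random(0)
# ASSUMPTION: the source and target groups follow related but not identical rules.
source_xs, source_ys = make_data(rng, 1000, [2.0, -1.0, 0.5, 0.0, 1.0])
target_xs, target_ys = make_data(rng, 12, [1.8, -1.2, 0.4, 0.3, 0.9])

zeros = [0.0] * 5
pretrained = train(zeros, source_xs, source_ys)                   # large group
fine_tuned = train(pretrained, target_xs, target_ys, epochs=20)   # small group
```

The point of the design is that the small group only has to adjust weights that already encode what was learned from the large group, instead of learning everything from 12 examples.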

The temporal resolution of brain decoding is another big hurdle for mind-reading products, which usually require decoding brain activity in real or nearly real time (within milliseconds). The real-time brain-machine interface is one example, e.g. controlling prosthetic limbs with the mind after paralysis due to brain injury or stroke. To tackle this problem, my decoding model worked on the time series of brain signals instead of analyzing the classical spatial patterns of brain activations. Using this framework, the model was able to classify five types of body movements with 95% accuracy by using 720 milliseconds of brain scans. Moreover, the model not only improves the temporal resolution of brain decoding in fMRI but also provides a framework that can be used for other brain imaging modalities, for instance electroencephalography and magnetoencephalography. This model is a promising candidate for developing brain-machine interface devices that can translate thought into action in real time, including controlling robotic arms, generating speech and text, creating art and music, and evaluating emotions.

My decoding model’s key features, generalizability and temporal resolution, make it useful for further research in related fields, paving the way for a better understanding of complex and sequential cognitive functions. More exciting projects can be built using this architecture to improve our daily lives, for example by helping with diagnosis and postoperative predictions for patients with neurological and psychiatric disorders.