Summary: Researchers discovered the brain decodes speech differently in noisy environments depending on the speech’s volume and our focus on it.
Their study, which combines neural recordings with computer models, shows that when we struggle to follow a conversation amid louder voices, the brain encodes phonetic information differently than when the voice is easily heard.
The findings could bring significant improvements to auditory attention-decoding systems, particularly for brain-controlled hearing aids that isolate attended speech.
Key Facts:
- The study revealed that our brain encodes phonetic information differently in noisy situations, depending on the volume of the speech we’re focusing on and our level of attention to it.
- The researchers used neural recordings to generate predictive models of brain activity, demonstrating that “glimpsed” and “masked” phonetic information are encoded separately in our brain.
- This discovery could lead to significant advancements in hearing aid technology, specifically in improving auditory attention-decoding systems for brain-controlled hearing aids.
Source: PLOS
Researchers led by Dr. Nima Mesgarani at Columbia University, US, report that the brain treats speech in a crowded room differently depending on how easy it is to hear, and whether we are focusing on it.
Publishing June 6th in the open access journal PLOS Biology, the study uses a combination of neural recordings and computer modeling to show that when we follow speech that is being drowned out by louder voices, phonetic information is encoded differently than in the opposite situation.
The findings could help improve hearing aids that work by isolating attended speech.
Focusing on speech in a crowded room can be difficult, especially when other voices are louder. However, amplifying all sounds equally does little to improve the ability to isolate these hard-to-hear voices, and hearing aids that try to amplify only attended speech are still too inaccurate for practical use.
To better understand how speech is processed in these situations, the researchers at Columbia University recorded neural activity from electrodes implanted in the brains of people with epilepsy as they underwent brain surgery. The patients were asked to attend to a single voice, which was sometimes louder than another voice (“glimpsed”) and sometimes quieter (“masked”).
The researchers used the neural recordings to generate predictive models of brain activity. The models showed that phonetic information of “glimpsed” speech was encoded in both primary and secondary auditory cortex of the brain, and that encoding of the attended speech was enhanced in the secondary cortex.
In contrast, phonetic information of “masked” speech was only encoded if it was the attended voice. Lastly, speech encoding occurred later for “masked” speech than for “glimpsed” speech. Because “glimpsed” and “masked” phonetic information appear to be encoded separately, focusing on deciphering only the “masked” portion of attended speech could lead to improved auditory attention-decoding systems for brain-controlled hearing aids.
Vinay Raghavan, the lead author of the study, says, “When listening to someone in a noisy place, your brain recovers what you missed when the background noise is too loud. Your brain can also catch bits of speech you aren’t focused on, but only when the person you’re listening to is quiet in comparison.”
About this auditory neuroscience research news
Author: Nima Mesgarani
Source: PLOS
Contact: Nima Mesgarani – PLOS
Original Research: Open access.
“Distinct neural encoding of glimpsed and masked speech in multitalker situations” by Nima Mesgarani et al. PLOS Biology
Abstract
Humans can easily tune in to one talker in a multitalker environment while still picking up bits of background speech; however, it remains unclear how we perceive speech that is masked and to what degree non-target speech is processed.
Some models suggest that perception can be achieved through glimpses, which are spectrotemporal regions where a talker has more energy than the background. Other models, however, require the recovery of the masked regions.
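The definition of a glimpse given above can be illustrated with a small sketch. This is not the authors' code: the function name `glimpse_mask`, the toy power spectrograms, and the 0 dB threshold are all assumptions made for illustration, and the study may define the boundary between glimpsed and masked regions differently.

```python
import numpy as np

def glimpse_mask(target_spec, background_spec, threshold_db=0.0):
    """Return True where the target talker has more energy than the background.

    Both inputs are power spectrograms (frequency x time). The 0 dB default
    threshold is an illustrative assumption, not a value from the study.
    """
    eps = 1e-12  # avoids log of zero in silent bins
    local_snr_db = 10.0 * np.log10((target_spec + eps) / (background_spec + eps))
    return local_snr_db > threshold_db

# Toy spectrograms: 2 frequency bands x 3 time frames (hypothetical values).
target = np.array([[4.0, 1.0, 0.5],
                   [0.2, 3.0, 1.0]])
background = np.array([[1.0, 2.0, 0.5],
                       [0.1, 6.0, 4.0]])

glimpsed = glimpse_mask(target, background)  # regions where the target dominates
masked = ~glimpsed                           # regions where the background dominates
```

Under this sketch, every spectrotemporal bin is labeled either glimpsed or masked, which is what lets stimulus features be split into the two sets the abstract refers to.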
To clarify this issue, we directly recorded from primary and non-primary auditory cortex (AC) in neurosurgical patients as they attended to one talker in multitalker speech and trained temporal response function models to predict high-gamma neural activity from glimpsed and masked stimulus features.
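A temporal response function model is, in its common form, a regularized linear regression from time-lagged stimulus features to a neural signal. The sketch below shows that general idea on simulated data. It is not the study's pipeline: the helper names (`lagged_design`, `fit_trf`, `predict_trf`), the ridge penalty, the number of lags, and the random data are all hypothetical.

```python
import numpy as np

def lagged_design(stimulus, n_lags):
    """Stack delayed copies of the stimulus (time x features) side by side."""
    n_times, n_feats = stimulus.shape
    X = np.zeros((n_times, n_feats * n_lags))
    for lag in range(n_lags):
        # Column block `lag` holds the stimulus delayed by `lag` samples.
        X[lag:, lag * n_feats:(lag + 1) * n_feats] = stimulus[:n_times - lag]
    return X

def fit_trf(stimulus, response, n_lags, ridge=1.0):
    """Fit TRF weights by ridge regression (closed-form solution)."""
    X = lagged_design(stimulus, n_lags)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ response)

def predict_trf(stimulus, weights, n_lags):
    """Predict the neural response from the stimulus and fitted weights."""
    return lagged_design(stimulus, n_lags) @ weights

# Simulated example: 3 stimulus features, 5 lags, one recording channel.
rng = np.random.default_rng(0)
n_lags = 5
stim = rng.standard_normal((2000, 3))
true_w = rng.standard_normal(3 * n_lags)
resp = lagged_design(stim, n_lags) @ true_w + 0.1 * rng.standard_normal(2000)

w = fit_trf(stim, resp, n_lags)
pred = predict_trf(stim, w, n_lags)
r = np.corrcoef(pred, resp)[0, 1]  # prediction accuracy as a correlation
```

In a setting like the one described, the stimulus matrix would hold glimpsed and masked features and the response would be high-gamma activity from one electrode. A real analysis would also score predictions on held-out data rather than on the training data used here.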
We found that glimpsed speech is encoded at the level of phonetic features for target and non-target talkers, with enhanced encoding of target speech in non-primary AC. In contrast, encoding of masked phonetic features was found only for the target, with a greater response latency and distinct anatomical organization compared to glimpsed phonetic features.
These findings suggest separate mechanisms for encoding glimpsed and masked speech and provide neural evidence for the glimpsing model of speech perception.