For my auditory systems class (with Steffen Hage), I had to write an essay on the question "How do single neurons in the auditory cortex differentiate self-produced vocalizations from heard vocalizations?". I quite liked writing the essay, and I hope you enjoy reading it. :)
Introduction
When two (or more) individuals engage in interactive vocal communication, the brain faces a number of challenges. One such challenge is determining whether a perceived sound is self-produced or comes from an external source (i.e. the other speaker). Marmosets (Callithrix jacchus) provide a great model organism to study this question, as i) unlike humans, they permit the use of invasive electrophysiological methods, and ii) unlike other non-human primates, they convergently evolved the ability for very rich social-vocal interactions. For example, they engage in antiphonal calling, in which marmosets reciprocally exchange “Phee” calls and try not to overlap their vocalizations in time (Eliades & Miller, 2017). Self-other vocalization differentiation (SOVD) is behaviourally relevant for self-monitoring of one’s own vocal production. For example, during vocal turn-taking, marmosets may need to interrupt their Phee call when another individual starts vocalizing (demonstrated with a noise paradigm in Fig. 1A). Self-monitoring may also involve correcting the pitch (frequency) of one’s voice (see Fig. 2B).
There are two main ways in which auditory cortex (AC) neurons could achieve SOVD, and they are not mutually exclusive. 1) The AC has an internal model of the acoustic features that distinguish self-produced from other-produced vocalizations, and AC neurons classify incoming sounds based on ascending auditory inputs. 2) Before the initiation of a vocalization, the AC receives a forward model or efference copy about the timing of the vocalization. This allows the AC to compare the timing of incoming sounds with the forward model and to compute a sensory prediction error (Eliades & Wang, 2019). In a variation of 2), the forward model includes not only the timing but also the acoustic features of the vocalization. This essay will mainly evaluate evidence for the second theory.
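To make the timing-based idea in 2) a bit more concrete, here is a toy sketch in Python. It is only an illustration under made-up assumptions: the function name, the 50 ms tolerance and the onset times are all hypothetical and are not taken from the cited papers.

```python
def classify_sound_onsets(predicted_onsets, heard_onsets, tolerance=0.05):
    """Label each heard sound onset as 'self' or 'other'.

    predicted_onsets: onset times (s) announced by the efference copy.
    heard_onsets: onset times (s) of sounds arriving via the ascending pathway.
    tolerance: hypothetical maximum timing error (s) still counted as a match.
    """
    labels = []
    for heard in heard_onsets:
        # Timing prediction error: distance to the closest predicted onset.
        error = min(
            (abs(heard - predicted) for predicted in predicted_onsets),
            default=float("inf"),  # no prediction at all: error is unbounded
        )
        label = "self" if error <= tolerance else "other"
        labels.append((label, error))
    return labels


# Hypothetical example: two planned calls, three sounds heard.
labels = classify_sound_onsets(
    predicted_onsets=[1.00, 4.00],
    heard_onsets=[1.02, 2.50, 4.01],
)
for label, error in labels:
    print(f"{label}: timing error {error:.2f} s")
```

In this sketch, a sound with a small prediction error is treated as self-produced, while a sound arriving at an unpredicted time is attributed to an external source.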
Results
AC neurons are suppressed during a Phee vocalization (Fig. 2A-C). During frequency-shifted feedback (as in Fig. 1B), the suppression of neural activity is reduced (Fig. 2B-C, red). This is not the case for amplified feedback (Fig. 2B-C, black). Eliades & Wang (2003, 2013) classified AC neurons into two populations: “suppressed neurons” (65-90%) and “excited neurons” (10-35%). During frequency-shifted feedback, suppressed neurons showed a larger increase in firing rate than excited neurons (Fig. 2D-E). This was quantified by comparing the response modulation index (RMI) for altered feedback vs baseline, where the RMI measures the relative change in firing rate from before to during the vocalization (Fig. 2F). The larger RMI difference for suppressed neurons may suggest that suppression of neural activity facilitates better self-monitoring of vocalizations (see the behavioural context in Fig. 1 and Eliades & Tsunada, 2018).
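As a rough illustration of the RMI, the sketch below assumes the normalized-difference form (R_vocal − R_pre) / (R_vocal + R_pre), which is how I understand the index. The function name and the firing rates are hypothetical and chosen only for illustration, not data from the papers.

```python
def response_modulation_index(rate_pre, rate_vocal):
    """Normalized firing rate change from before to during vocalization.

    Assumed form: (rate_vocal - rate_pre) / (rate_vocal + rate_pre).
    Negative values indicate suppression, positive values excitation,
    and the index is bounded between -1 and +1.
    """
    total = rate_vocal + rate_pre
    if total == 0:
        return 0.0  # no spikes in either window: treat as no modulation
    return (rate_vocal - rate_pre) / total


# Hypothetical firing rates in spikes/s for one suppressed neuron.
rmi_normal = response_modulation_index(rate_pre=20.0, rate_vocal=5.0)
rmi_shifted = response_modulation_index(rate_pre=20.0, rate_vocal=12.0)

# A positive difference means less suppression during altered feedback.
rmi_difference = rmi_shifted - rmi_normal
print(rmi_normal, rmi_shifted, rmi_difference)
```

With these made-up numbers, the neuron is strongly suppressed during normal vocalization and less suppressed during frequency-shifted feedback, so the RMI difference is positive.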
The question remains what drives the suppression of AC neural activity and how this relates to SOVD. Interestingly, although AC activity is suppressed during vocalization, gamma-band oscillations (>25 Hz) are increased (Tsunada & Eliades, 2020). Importantly, this increase in gamma oscillations was much larger when marmosets vocalized (“induced” response) than when they passively listened to a playback of their vocalization (“evoked” response). Furthermore, the authors found that the “induced” component of the gamma response was strongly correlated with pre-vocal gamma activity, suggesting that neural activity before vocalization onset drives AC suppression.
An efference copy signal before vocalization onset could drive AC suppression. The question is where this signal comes from. Eliades & Wang (2019) initially proposed the cerebellum or basal ganglia as possible regions for computing the forward model. Yet recent research points towards the volitional articulatory motor network (VAMN) as a good candidate (Fig. 3).
Studies in the ventral premotor cortex (PMv) of marmosets (Roy et al., 2016) and macaques (Gavrilov & Nieder, 2021; Hage, 2018) showed neuronal suppression before vocalization onset. Importantly, in control trials where the monkeys also engaged in behaviours such as orofacial movements or cued hand movements, neuronal suppression was less prominent, suggesting the effect is specific to vocalization.
Although no study so far has recorded simultaneously in AC and VAMN to show this conclusively, the studies above suggest that neuronal suppression in VAMN precedes that in AC. Yet the origin or directionality of the efference copy signal within the VAMN is less clear. Recording in PMv, the inferior arcuate sulcus (ASi or BA44), and the ventral pre-arcuate region (VPA or BA45), Gavrilov & Nieder (2021) did not find significant differences in the latency of neuronal suppression across regions.
Discussion
The suppression of neural activity in AC during vocalizations suggests that AC can perform SOVD. One possibility is that AC neurons can anticipate a vocalization due to an efference copy from another region. A promising candidate is the VAMN, since it also shows neuronal suppression before vocal onset. Yet other regions such as the cerebellum are also plausible and should be studied further. Considering that gamma oscillations in AC are increased before and during vocalization, while neuronal firing is suppressed, it may also be fruitful to look for gamma oscillations in the VAMN.
So far, this discussion has only considered how the efference copy may carry a signal about the timing of self-produced vocalizations. However, the VAMN is close to other frontal regions and is connected to the superior temporal gyrus via the arcuate fasciculus. Thus, it is plausible that this forward model also carries rich information about self-produced vocalizations, such as characteristic acoustic features or even semantic aspects, which may help SOVD. This may be even more true in humans.
Bibliography
Eliades, S. J., & Miller, C. T. (2017). Marmoset vocal communication: Behavior and neurobiology. Developmental Neurobiology, 77(3), 286–299. https://doi.org/10.1002/dneu.22464
Eliades, S. J., & Tsunada, J. (2018). Auditory cortical activity drives feedback-dependent vocal control in marmosets. Nature Communications, 9(1). https://doi.org/10.1038/s41467-018-04961-8
Eliades, S. J., & Tsunada, J. (2023). Effects of Cortical Stimulation on Feedback-Dependent Vocal Control in Non-Human Primates. The Laryngoscope, 133(S2), S1–S10. https://doi.org/10.1002/lary.30175
Eliades, S. J., & Wang, X. (2003). Sensory-Motor Interaction in the Primate Auditory Cortex During Self-Initiated Vocalizations. Journal of Neurophysiology, 89(4), 2194–2207. https://doi.org/10.1152/jn.00627.2002
Eliades, S. J., & Wang, X. (2008). Neural substrates of vocalization feedback monitoring in primate auditory cortex. Nature, 453(7198). https://doi.org/10.1038/nature06910
Eliades, S. J., & Wang, X. (2013). Comparison of auditory-vocal interactions across multiple types of vocalizations in marmoset auditory cortex. Journal of Neurophysiology, 109(6), 1638–1657. https://doi.org/10.1152/jn.00698.2012
Eliades, S. J., & Wang, X. (2019). Corollary Discharge Mechanisms During Vocal Production in Marmoset Monkeys. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 4(9), 805–812. https://doi.org/10.1016/j.bpsc.2019.06.008
Gavrilov, N., & Nieder, A. (2021). Distinct neural networks for the volitional control of vocal and manual actions in the monkey homologue of Broca’s area. eLife, 10, e62797. https://doi.org/10.7554/eLife.62797
Hage, S. R. (2018). Auditory and audio-vocal responses of single neurons in the monkey ventral premotor cortex. Hearing Research, 366, 82–89. https://doi.org/10.1016/j.heares.2018.03.019
Hage, S. R., & Nieder, A. (2016). Dual neural network model for the evolution of speech and language. Trends in Neurosciences, 39(12).
Pomberger, T., Risueno-Segovia, C., Löschner, J., & Hage, S. R. (2018). Precise Motor Control Enables Rapid Flexibility in Vocal Behavior of Marmoset Monkeys. Current Biology, 28(5), 788-794.e3. https://doi.org/10.1016/j.cub.2018.01.070
Roy, S., Zhao, L., & Wang, X. (2016). Distinct Neural Activities in Premotor Cortex during Natural Vocal Behaviors in a New World Primate, the Common Marmoset (Callithrix jacchus). The Journal of Neuroscience, 36(48), 12168–12179. https://doi.org/10.1523/JNEUROSCI.1646-16.2016
Tsunada, J., & Eliades, S. J. (2020). Dissociation of Unit Activity and Gamma Oscillations during Vocalization in Primate Auditory Cortex. Journal of Neuroscience, 40(21), 4158–4171. https://doi.org/10.1523/JNEUROSCI.2749-19.2020