Our ability to identify or recognize visual objects is often enhanced by evidence provided by other sensory modalities. Yet, where and how visual object processing benefit from the information received by the other senses remains unclear. One candidate region is the temporal lobe, which features neural representations of visual objects, and in which previous studies have provided evidence for multisensory influences on neural responses. In the present study we directly tested whether visual representations in the lower bank of the superior temporal sulcus (STS) benefi t from acoustic information. To this end, we recorded neural responses in alert monkeys passively watching audio-visual scenes, and quantified the impact of simultaneously presented sounds on responses elicited by the presentation of naturalistic visual scenes. Using methods of stimulus decoding and information theory, we then asked whether the responses of STS neurons become more reliable and informative in multisensory contexts. Our results demonstrate that STS neurons are indeed sensitive to the modality composition of the sensory stimulus. Importantly, information provided by STS neurons' responses about the particular visual stimulus being presented was highest during congruent audio-visual and unimodal visual stimulation, but was reduced during incongruent bimodal stimulation. Together, these findings demonstrate that higher visual representations in the STS not only convey information about the visual input but also depend on the acoustic context of a visual scene.
ASJC Scopus subject areas