A contrastive-learning approach for auditory attention detection
Abstract: Carrying on a conversation in a multi-sound environment is challenging because the sounds overlap in time and frequency, making it difficult to follow a single sound source. One proposed approach to isolating an attended speech source is to decode the electroencephalogram (EEG) and identify the attended audio source using statistical or machine learning techniques. However, the limited amount of data compared with other machine learning problems, together with the distributional shift between EEG recordings, calls for a self-supervised approach that works with limited data and yields a more robust solution. In this paper, we propose a self-supervised learning method that minimizes the difference between the latent representation of an attended speech signal and that of the corresponding EEG signal. The network is then fine-tuned for the auditory attention classification task. We compare our results with previously published methods and achieve state-of-the-art performance on the validation set.
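The abstract does not state which loss is used to align the two latent spaces. As a rough illustration only, the sketch below uses an InfoNCE-style contrastive loss, one common choice for this kind of alignment and not necessarily the authors' objective. The function names, the temperature value, and the synthetic latents are all assumptions made for the example.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def info_nce_loss(eeg_latents, speech_latents, temperature=0.1):
    """InfoNCE-style loss (illustrative assumption, not the paper's stated loss).

    Row i of eeg_latents is paired with row i of speech_latents (the attended
    speech); every other row in the batch acts as a negative.
    """
    z_eeg = l2_normalize(eeg_latents)
    z_speech = l2_normalize(speech_latents)
    logits = z_eeg @ z_speech.T / temperature          # shape (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Matching pairs sit on the diagonal; average their negative log-probability.
    return -np.mean(np.diag(log_probs))

# Synthetic latents standing in for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
speech = rng.normal(size=(8, 16))
aligned_eeg = speech + 0.05 * rng.normal(size=(8, 16))  # EEG latents near speech
random_eeg = rng.normal(size=(8, 16))                   # EEG latents unrelated

loss_aligned = info_nce_loss(aligned_eeg, speech)
loss_random = info_nce_loss(random_eeg, speech)
print(loss_aligned, loss_random)
```

In this toy setup the loss is lower when the EEG latents lie close to the latents of the attended speech, which is the behaviour a pretraining stage of this kind would aim to encourage before fine-tuning on the classification task.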