Recent advances in transformer-based architectures have shown promise in several machine learning tasks. In the audio domain, such architectures have been successfully …
Emotion recognition datasets are relatively small, making the use of the more sophisticated deep learning approaches challenging. In this work, we propose a transfer learning method …
Y Shou, T Meng, W Ai, F Zhang, N Yin, K Li - Information Fusion, 2024 - Elsevier
With the rapid development of social media and human–computer interaction, multimodal emotion recognition in conversations (MERC) tasks have begun to receive widespread …
Z Chen, J Li, H Liu, X Wang, H Wang… - Expert Systems with …, 2023 - Elsevier
Speech emotion recognition (SER) has become a crucial topic in the field of human–computer interactions. Feature representation plays an important role in SER, but there are …
Progress in speech processing has been facilitated by shared datasets and benchmarks. Historically these have focused on automatic speech recognition (ASR), speaker …
Y Shou, T Meng, W Ai, N Yin, K Li - arXiv preprint arXiv:2312.16778, 2023 - arxiv.org
With the release of increasing open-source emotion recognition datasets on social media platforms and the rapid development of computing resources, multimodal emotion …
S Srinivasan, Z Huang… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Generic pre-trained speech and text representations promise to reduce the need for large labeled datasets on specific speech and language tasks. However, it is not clear how to …
This paper describes our submission to the ADreSSo Challenge, which focuses on the problem of automatic recognition of Alzheimer's Disease (AD) from speech. The audio …