Speech emotion recognition approaches: A systematic review

A Hashem, M Arif, M Alghamdi - Speech Communication, 2023 - Elsevier
The speech emotion recognition (SER) field has been active since it became a crucial
feature in advanced Human-Computer Interaction (HCI), and wide real-life applications use …

Speech emotion recognition using machine learning—A systematic review

S Madanian, T Chen, O Adeleye, JM Templeton… - Intelligent systems with …, 2023 - Elsevier
Speech emotion recognition (SER) as a Machine Learning (ML) problem continues to
garner a significant amount of research interest, especially in the affective computing …

Dawn of the transformer era in speech emotion recognition: closing the valence gap

J Wagner, A Triantafyllopoulos… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Recent advances in transformer-based architectures have shown promise in several
machine learning tasks. In the audio domain, such architectures have been successfully …

A fine-tuned wav2vec 2.0/HuBERT benchmark for speech emotion recognition, speaker verification and spoken language understanding

Y Wang, A Boumadane, A Heba - arXiv preprint arXiv:2111.02735, 2021 - arxiv.org
Speech self-supervised models such as wav2vec 2.0 and HuBERT are making revolutionary
progress in Automatic Speech Recognition (ASR). However, they have not been totally …

Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation

H Tak, M Todisco, X Wang, J Jung, J Yamagishi… - arXiv preprint arXiv …, 2022 - arxiv.org
The performance of spoofing countermeasure systems depends fundamentally upon the use
of sufficiently representative training data. With this usually being limited, current solutions …

Speech emotion recognition using self-supervised features

E Morais, R Hoory, W Zhu, I Gat… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Self-supervised pre-trained features have consistently delivered state-of-art results in the
field of natural language processing (NLP); however, their merits in the field of speech …

Exploring wav2vec 2.0 fine tuning for improved speech emotion recognition

LW Chen, A Rudnicky - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
While Wav2Vec 2.0 has been proposed for speech recognition (ASR), it can also be used for
speech emotion recognition (SER); its performance can be significantly improved using …

Large-scale self-supervised speech representation learning for automatic speaker verification

Z Chen, S Chen, Y Wu, Y Qian, C Wang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
The speech representations learned from large-scale unlabeled data have shown better
generalizability than those from supervised learning and thus attract a lot of interest to be …

Transformers in speech processing: A survey

S Latif, A Zaidi, H Cuayahuitl, F Shamshad… - arXiv preprint arXiv …, 2023 - arxiv.org
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …

Fine-tuning wav2vec2 for speaker recognition

N Vaessen, DA Van Leeuwen - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
This paper explores applying the wav2vec2 framework to speaker recognition instead of
speech recognition. We study the effectiveness of the pre-trained weights on the speaker …