A review of speech-centric trustworthy machine learning: Privacy, safety, and fairness

T Feng, R Hebbar, N Mehlman, X Shi… - … on Signal and …, 2023 - nowpublishers.com
Speech-centric machine learning systems have revolutionized a number of leading
industries ranging from transportation and healthcare to education and defense …

Deep representation learning in speech processing: Challenges, recent advances, and future trends

S Latif, R Rana, S Khalifa, R Jurdak, J Qadir… - arXiv preprint arXiv …, 2020 - arxiv.org
Research on speech processing has traditionally considered the task of designing hand-
engineered acoustic features (feature engineering) as a separate distinct problem from the …

Massively multilingual adversarial speech recognition

O Adams, M Wiesner, S Watanabe… - arXiv preprint arXiv …, 2019 - arxiv.org
We report on adaptation of multilingual end-to-end speech recognition models trained on as
many as 100 languages. Our findings shed light on the relative importance of similarity …

UniverSLU: Universal spoken language understanding for diverse classification and sequence generation tasks with a single network

S Arora, H Futami, J Jung, Y Peng, R Sharma… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent studies have demonstrated promising outcomes by employing large language
models with multi-tasking capabilities. They utilize prompts to guide the model's behavior …

End-to-end domain-adversarial voice activity detection

M Lavechin, MP Gill, R Bousbib, H Bredin… - arXiv preprint arXiv …, 2019 - arxiv.org
Voice activity detection is the task of detecting speech regions in a given audio stream or
recording. First, we design a neural network combining trainable filters and recurrent layers …

A study of bias mitigation strategies for speaker recognition

R Peri, K Somandepalli, S Narayanan - Computer Speech & Language, 2023 - Elsevier
Speaker recognition is increasingly used in several everyday applications including smart
speakers, customer care centers and other speech-driven analytics. It is crucial to accurately …

Best of both worlds: Robust accented speech recognition with adversarial transfer learning

N Das, S Bodapati, M Sunkara, S Srinivasan… - arXiv preprint arXiv …, 2021 - arxiv.org
Training deep neural networks for automatic speech recognition (ASR) requires large
amounts of transcribed speech. This becomes a bottleneck for training robust models for …

Unsupervised domain adaptation schemes for building ASR in low-resource languages

CS Anoop, AP Prathosh… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Building an automatic speech recognition (ASR) system from scratch requires a large
amount of annotated speech data, which is difficult to collect in many languages. However …

Adversarial regularization for attention based end-to-end robust speech recognition

S Sun, P Guo, L Xie, MY Hwang - IEEE/ACM Transactions on …, 2019 - ieeexplore.ieee.org
End-to-end speech recognition, such as attention based approaches, is an emerging and
attractive topic in recent years. It has achieved comparable performance with the traditional …

Disentangled speaker and nuisance attribute embedding for robust speaker verification

WH Kang, SH Mun, MH Han, NS Kim - IEEE Access, 2020 - ieeexplore.ieee.org
Over the recent years, various deep learning-based embedding methods have been
proposed and have shown impressive performance in speaker verification. However, as in …