Dysarthria severity classification using multi-head attention and multi-task learning

AA Joshy, R Rajan - Speech Communication, 2023 - Elsevier
Identifying the severity of dysarthria is considered a diagnostic step in monitoring the
patient's progress and a beneficial step in the transcription of dysarthric speech. In this …

A study of bias mitigation strategies for speaker recognition

R Peri, K Somandepalli, S Narayanan - Computer Speech & Language, 2023 - Elsevier
Speaker recognition is increasingly used in several everyday applications including smart
speakers, customer care centers and other speech-driven analytics. It is crucial to accurately …

Investigating the contribution of speaker attributes to speaker separability using disentangled speaker representations

C Luu, S Renals, P Bell - Interspeech 2022, 2022 - research.ed.ac.uk
Deep speaker embeddings have been shown to encode a wide variety of attributes relating
to a speaker. The aim of this work is to separate out some of these attributes in the …

Disentangled representation learning for multilingual speaker recognition

K Nam, Y Kim, J Huh, HS Heo, J Jung… - arXiv preprint arXiv …, 2022 - arxiv.org
The goal of this paper is to learn robust speaker representation for bilingual speaking
scenario. The majority of the world's population speak at least two languages; however …

Identity, Gender, Age, and Emotion Recognition from Speaker Voice with Multi-task Deep Networks for Cognitive Robotics

P Foggia, A Greco, A Roberto, A Saggese… - Cognitive Computation, 2024 - Springer
This paper presents a study on the use of multi-task neural networks (MTNs) for voice-based
soft biometrics recognition, e.g., gender, age, and emotion, in social robots. MTNs enable …

Disentangled Representation Learning for Environment-agnostic Speaker Recognition

KH Nam, HS Heo, J Jung, JS Chung - arXiv preprint arXiv:2406.14559, 2024 - arxiv.org
This work presents a framework based on feature disentanglement to learn speaker
embeddings that are robust to environmental variations. Our framework utilises an auto …

Robust End-to-End Diarization with Domain Adaptive Training and Multi-Task Learning

I Fung, L Samarakoon… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Due to the scarcity of publicly available diarization data, the model performance can be
improved by training a single model with data from different domains. In this work, we …

Explainable Attribute-Based Speaker Verification

X Wu, C Luu, P Bell, A Rajan - arXiv preprint arXiv:2405.19796, 2024 - arxiv.org
This paper proposes a fully explainable approach to speaker verification (SV), a task that
fundamentally relies on individual speaker characteristics. The opaque use of speaker …

To train or not to train adversarially: A study of bias mitigation strategies for speaker recognition

R Peri, K Somandepalli, S Narayanan - arXiv preprint arXiv:2203.09122, 2022 - arxiv.org
Speaker recognition is increasingly used in several everyday applications including smart
speakers, customer care centers and other speech-driven analytics. It is crucial to accurately …

Acoustic Model Adaptation In Reverberant Conditions Using Multi-task Learned Embeddings

A Raikar, M Soni, A Panda… - 2022 30th European …, 2022 - ieeexplore.ieee.org
Acoustic environment plays a major role in the performance of a large-scale Automatic
Speech Recognition (ASR) system. It becomes a lot more challenging when substantial …