Spectral modification based data augmentation for improving end-to-end ASR for children's speech

VP Singh, H Sailor, S Bhattacharya… - arXiv preprint arXiv …, 2022 - arxiv.org
Training a robust Automatic Speech Recognition (ASR) system for children's speech
recognition is a challenging task due to inherent differences in acoustic attributes of adult …

No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

D Fucci, M Gaido, M Negri, M Cettolo… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Automatic speech recognition (ASR) systems are known to be sensitive to the sociolinguistic
variability of speech data, in which gender plays a crucial role. This can result in disparities …

ChildAugment: Data augmentation methods for zero-resource children's speaker verification

VP Singh, M Sahidullah, T Kinnunen - The Journal of the Acoustical …, 2024 - pubs.aip.org
The accuracy of modern automatic speaker verification (ASV) systems, when trained
exclusively on adult data, drops substantially when applied to children's speech. The …

Speaking of accent: A content analysis of accent misconceptions in ASR research

K Prinos, N Patwari, CA Power - The 2024 ACM Conference on Fairness …, 2024 - dl.acm.org
Automatic speech recognition (ASR) researchers are working to address the differing
transcription performance of ASR by accent or dialect. However, research often has a limited …

Augmented dialectal speech recognition for AI-based neuropsychological scale assessment in Alzheimer's disease

M Zhang, Q Cui, W Li, W Yu, L Chen, W Li… - … Signal Processing and …, 2025 - Elsevier
Alzheimer's disease (AD) is a prevalent and widespread neurodegenerative disorder among
the older adult population worldwide. Among the numerous cognitive screening methods …

Leveraging multiple sources in automatic African American English dialect detection for adults and children

A Johnson, VM Shetty, M Ostendorf… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
This paper presents a novel system which utilizes acoustic, phonological,
morphosyntactic, and prosodic information for binary automatic dialect detection of African …

Using modified adult speech as data augmentation for child speech recognition

Z Fan, X Cao, G Salvi… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Data augmentation is a technique which enhances the size and quality of training data such
that deep learning or machine learning models can achieve better performance. This paper …

Exploring the Role of Data Augmentation and Acoustic Feature Concatenation in the Context of Zero-Resource Children's ASR

Ankita, S Shahnawazuddin - Circuits, Systems, and Signal Processing, 2024 - Springer
Our present work studies the impact of employing out-of-domain data augmentation and
front-end acoustic features concatenation on zero-resource children's automatic speech …

An exploratory study on dialect density estimation for children and adult's African American English

A Johnson, NB Shankar, M Ostendorf… - The Journal of the …, 2024 - pubs.aip.org
This paper evaluates an innovative framework for spoken dialect density prediction on
children's and adults' African American English. A speaker's dialect density is defined as the …

The JIBO Kids Corpus: A speech dataset of child-robot interactions in a classroom environment

NB Shankar, A Afshan, A Johnson, A Mahapatra… - JASA Express …, 2024 - pubs.aip.org
balaji1312@ucla.edu; amberafshan@ucla.edu; ajohnson49@ucla.edu; aurosweta99@ucla.edu;
alemartin@ucla.edu; michaelni12@ucla.edu; haewon@mit.edu; mquint30 …