Hybrid convolutional neural networks for articulatory and acoustic information based speech recognition

V Mitra, G Sivaraman, H Nam, C Espy-Wilson… - Speech …, 2017 - Elsevier
Studies have shown that articulatory information helps model speech variability and,
consequently, improves speech recognition performance. But learning speaker-invariant …

Harmonic attention for monaural speech enhancement

T Wang, W Zhu, Y Gao, S Zhang… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
To further improve the quality of enhanced speech, it is appealing to introduce deeper
articulatory and auditory knowledge into the speech enhancement …

The SRI AVEC-2014 evaluation system

V Mitra, E Shriberg, M McLaren, A Kathol… - Proceedings of the 4th …, 2014 - dl.acm.org
Though depression is a common mental health problem with significant impact on human
society, it often goes undetected. We explore a diverse set of features based only on spoken …

Articulatory and bottleneck features for speaker-independent ASR of dysarthric speech

E Yılmaz, V Mitra, G Sivaraman, H Franco - Computer Speech & Language, 2019 - Elsevier
Rapid population aging has stimulated the development of assistive devices that provide
personalized medical support to the needy suffering from various etiologies. One …

[PDF][PDF] A multimodal real-time MRI articulatory corpus for speech research

S Narayanan, E Bresch, PK Ghosh… - … Annual Conference of …, 2011 - sail.usc.edu
We present MRI-TIMIT: a large-scale database of synchronized audio and real-time
magnetic resonance imaging (rtMRI) data for speech research. The database currently …

Acoustic-to-articulatory mapping with joint optimization of deep speech enhancement and articulatory inversion models

AS Shahrebabaki, G Salvi, T Svendsen… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
We investigate the problem of speaker independent acoustic-to-articulatory inversion (AAI)
in noisy conditions within the deep neural network (DNN) framework. In contrast with recent …

Exploiting cross domain acoustic-to-articulatory inverted features for disordered speech recognition

S Hu, S Liu, X Xie, M Geng, T Wang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Articulatory features are inherently invariant to acoustic signal distortion and have been
successfully incorporated into automatic speech recognition (ASR) systems for normal …

Articulatory features from deep neural networks and their role in speech recognition

V Mitra, G Sivaraman, H Nam… - … on acoustics, speech …, 2014 - ieeexplore.ieee.org
This paper presents a deep neural network (DNN) to extract articulatory information from the
speech signal and explores different ways to use such information in a continuous speech …

Cross-domain deep visual feature generation for Mandarin audio–visual speech recognition

R Su, X Liu, L Wang, J Yang - IEEE/ACM Transactions on …, 2019 - ieeexplore.ieee.org
There has been a long term interest in using visual information to improve automatic speech
recognition (ASR) system performance. Both audio and visual information are required in …

[PDF][PDF] A comparative study of LPCC and MFCC features for the recognition of Assamese phonemes

U Bhattacharjee - International Journal of Engineering Research & …, 2013 - academia.edu
In this paper, two popular feature extraction techniques, Linear Predictive Cepstral
Coefficients (LPCC) and Mel Frequency Cepstral Coefficients (MFCC), have been …