An overview of noise-robust automatic speech recognition

J Li, L Deng, Y Gong… - IEEE/ACM Transactions …, 2014 - ieeexplore.ieee.org
New waves of consumer-centric applications, such as voice search and voice interaction
with mobile devices and home entertainment systems, increasingly require automatic …

Machine learning paradigms for speech recognition: An overview

L Deng, X Li - IEEE Transactions on Audio, Speech, and …, 2013 - ieeexplore.ieee.org
Automatic Speech Recognition (ASR) has historically been a driving force behind many
machine learning (ML) techniques, including the ubiquitously used hidden Markov model …

Librispeech: an ASR corpus based on public domain audio books

V Panayotov, G Chen, D Povey… - 2015 IEEE international …, 2015 - ieeexplore.ieee.org
This paper introduces a new corpus of read English speech, suitable for training and
evaluating speech recognition systems. The LibriSpeech corpus is derived from audiobooks …

Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition

GE Dahl, D Yu, L Deng, A Acero - IEEE Transactions on audio …, 2011 - ieeexplore.ieee.org
We propose a novel context-dependent (CD) model for large-vocabulary speech recognition
(LVSR) that leverages recent advances in using deep belief networks for phone recognition …

Parallel training of DNNs with natural gradient and parameter averaging

D Povey, X Zhang, S Khudanpur - arXiv preprint arXiv:1410.7455, 2014 - arxiv.org
We describe the neural-network training framework used in the Kaldi speech recognition
toolkit, which is geared towards training DNNs with large amounts of training data using …

Maximum likelihood linear transformations for HMM-based speech recognition

MJF Gales - Computer speech & language, 1998 - Elsevier
This paper examines the application of linear transformations for speaker and environmental
adaptation in an HMM-based speech recognition system. In particular, transformations that …

Improving deep neural network acoustic models using generalized maxout networks

X Zhang, J Trmal, D Povey… - 2014 IEEE international …, 2014 - ieeexplore.ieee.org
Recently, maxout networks have brought significant improvements to various speech
recognition and computer vision tasks. In this paper we introduce two new types of …

End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition

D Palaz, M Magimai-Doss, R Collobert - Speech Communication, 2019 - Elsevier
In hidden Markov model (HMM) based automatic speech recognition (ASR) system,
modeling the statistical relationship between the acoustic speech signal and the HMM states …

Method and system for non-parametric voice conversion

I Agiomyrgiannakis - US Patent 9,183,830, 2015 - Google Patents
A method and system is disclosed for non-parametric speech conversion. A text-to-speech
(TTS) synthesis system may …

[BOOK][B] Speech synthesis and recognition

W Holmes - 2002 - taylorfrancis.com
With the growing impact of information technology on daily life, speech is becoming
increasingly important for providing a natural means of communication between humans …