A tutorial survey of architectures, algorithms, and applications for deep learning

L Deng - APSIPA transactions on Signal and Information …, 2014 - cambridge.org
In this invited paper, my overview material on the same topic as presented in the plenary
overview session of APSIPA-2011 and the tutorial material presented in the same …

Machine learning paradigms for speech recognition: An overview

L Deng, X Li - IEEE Transactions on Audio, Speech, and …, 2013 - ieeexplore.ieee.org
Automatic Speech Recognition (ASR) has historically been a driving force behind many
machine learning (ML) techniques, including the ubiquitously used hidden Markov model …

State-of-the-art speech recognition with sequence-to-sequence models

CC Chiu, TN Sainath, Y Wu… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org
Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS)
subsume the acoustic, pronunciation and language model components of a traditional …

Deep learning: methods and applications

L Deng, D Yu - Foundations and trends® in signal processing, 2014 - nowpublishers.com
This monograph provides an overview of general deep learning methodology and its
applications to a variety of signal and information processing tasks. The application areas …

Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition

GE Dahl, D Yu, L Deng, A Acero - IEEE Transactions on audio …, 2011 - ieeexplore.ieee.org
We propose a novel context-dependent (CD) model for large-vocabulary speech recognition
(LVSR) that leverages recent advances in using deep belief networks for phone recognition …

Why does unsupervised pre-training help deep learning?

D Erhan, A Courville, Y Bengio… - Proceedings of the …, 2010 - proceedings.mlr.press
Much recent research has been devoted to learning algorithms for deep architectures such
as Deep Belief Networks and stacks of auto-encoder variants with impressive results being …

Parallel training of DNNs with natural gradient and parameter averaging

D Povey, X Zhang, S Khudanpur - arXiv preprint arXiv:1410.7455, 2014 - arxiv.org
We describe the neural-network training framework used in the Kaldi speech recognition
toolkit, which is geared towards training DNNs with large amounts of training data using …

The application of hidden Markov models in speech recognition

M Gales, S Young - Foundations and Trends® in Signal …, 2008 - nowpublishers.com
Full text available at: http://dx.doi.org/10.1561/2000000004 …

Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers

W Hu, Y Qian, FK Soong, Y Wang - Speech Communication, 2015 - Elsevier
Mispronunciation detection is an important part in a Computer-Aided Language Learning
(CALL) system. By automatically pointing out where mispronunciations occur in an …

Feature learning in deep neural networks - studies on speech recognition tasks

D Yu, ML Seltzer, J Li, JT Huang, F Seide - arXiv preprint arXiv:1301.3605, 2013 - arxiv.org
Recent studies have shown that deep neural networks (DNNs) perform significantly better
than shallow networks and Gaussian mixture models (GMMs) on large vocabulary speech …