Recent advances in deep learning for speech research at Microsoft

L Deng, J Li, JT Huang, K Yao, D Yu… - … on acoustics, speech …, 2013 - ieeexplore.ieee.org
Deep learning is becoming a mainstream technology for speech recognition at industrial
scale. In this paper, we provide an overview of the work by Microsoft speech researchers …

New types of deep neural network learning for speech recognition and related applications: An overview

L Deng, G Hinton, B Kingsbury - 2013 IEEE international …, 2013 - ieeexplore.ieee.org
In this paper, we provide an overview of the invited and contributed papers presented at the
special session at ICASSP-2013, entitled “New Types of Deep Neural Network Learning for …

On rectified linear units for speech processing

MD Zeiler, M Ranzato, R Monga, M Mao… - … , Speech and Signal …, 2013 - ieeexplore.ieee.org
Deep neural networks have recently become the gold standard for acoustic modeling in
speech recognition systems. The key computational unit of a deep network is a linear …

Deep learning: from speech recognition to language and multimodal processing

L Deng - APSIPA Transactions on Signal and Information …, 2016 - cambridge.org
While artificial neural networks have been in existence for over half a century, it was not until
2010 that they made a significant impact on speech recognition with a deep form of …

Speech acoustic modeling from raw multichannel waveforms

Y Hoshen, RJ Weiss, KW Wilson - 2015 IEEE international …, 2015 - ieeexplore.ieee.org
Standard deep neural network-based acoustic models for automatic speech recognition
(ASR) rely on hand-engineered input features, typically log-mel filterbank magnitudes. In this …

End-to-end neural network based automated speech scoring

L Chen, J Tao, S Ghaffarzadegan… - 2018 IEEE international …, 2018 - ieeexplore.ieee.org
In recent years, machine learning models for automated speech scoring systems were
mainly built using data-driven approaches with handcrafted features as one of the main …

Deep beamforming networks for multi-channel speech recognition

X Xiao, S Watanabe, H Erdogan, L Lu… - … , Speech and Signal …, 2016 - ieeexplore.ieee.org
Despite the significant progress in speech recognition enabled by deep neural networks,
poor performance persists in some scenarios. In this work, we focus on far-field speech …

An analysis of convolutional neural networks for speech recognition

JT Huang, J Li, Y Gong - 2015 IEEE International Conference …, 2015 - ieeexplore.ieee.org
Despite the fact that several sites have reported the effectiveness of convolutional neural
networks (CNNs) on some tasks, there is no deep analysis regarding why CNNs perform …

Learning filter banks within a deep neural network framework

TN Sainath, B Kingsbury, A Mohamed… - 2013 IEEE workshop …, 2013 - ieeexplore.ieee.org
Mel-filter banks are commonly used in speech recognition, as they are motivated from theory
related to speech production and perception. While features derived from mel-filter banks …

Speaker adaptation of context dependent deep neural networks

H Liao - 2013 IEEE International Conference on Acoustics …, 2013 - ieeexplore.ieee.org
There has been little work on examining how deep neural networks may be adapted to
speakers for improved speech recognition accuracy. Past work has examined using a …