Three classes of deep learning architectures and their applications: a tutorial survey

L Deng - APSIPA transactions on signal and information …, 2012 - academia.edu
In this invited paper, my overview material on the same topic as presented in the plenary
overview session of APSIPA-2011 and the tutorial material presented in the same …

Large-vocabulary continuous speech recognition systems: A look at some recent advances

G Saon, JT Chien - IEEE signal processing magazine, 2012 - ieeexplore.ieee.org
Over the past decade or so, several advances have been made to the design of modern
large vocabulary continuous speech recognition (LVCSR) systems to the point where their …

Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization

B Kingsbury, TN Sainath, H Soltau - Interspeech, 2012 - isca-archive.org
Training neural network acoustic models with sequence-discriminative criteria, such as
state-level minimum Bayes risk (sMBR), has been shown to produce large improvements in …

Structured ramp loss minimization for machine translation

K Gimpel, NA Smith - Proceedings of the 2012 Conference of the …, 2012 - aclanthology.org
This paper seeks to close the gap between training algorithms used in statistical machine
translation and machine learning, specifically the framework of empirical risk minimization …

Fundamental technologies in modern speech recognition

T OCKPH - IEEE Signal Processing Magazine, 2012 - Citeseer
There is a vast body of literature on LVCSR research and some limitation is necessary in the
scope of this article. We will focus primarily on the techniques that have been successful in …

Research on speech recognition technology and its application

Y Yu - 2012 international conference on computer science …, 2012 - ieeexplore.ieee.org
Speech recognition is a technology that uses a computer to convert a voice signal
into the associated text or command through identification and understanding. The paper depicts the …

Word-level acoustic modeling with convolutional vector regression

AL Maas, SD Miller, TM O'Neil, AY Ng… - Proc. ICML Workshop …, 2012 - ai.stanford.edu
We introduce a model that maps variable-length word utterances to a word vector space
using convolutional neural networks. Convolutional networks are a rich class of architecture …

Transcription of multi-genre media archives using out-of-domain data

PJ Bell, MJF Gales, P Lanchantin, X Liu… - 2012 IEEE Spoken …, 2012 - ieeexplore.ieee.org
We describe our work on developing a speech recognition system for multi-genre media
archives. The high diversity of the data makes this a challenging recognition task, which may …

Structured discriminative models for speech recognition: An overview

MJF Gales, S Watanabe… - IEEE Signal Processing …, 2012 - ieeexplore.ieee.org
Automatic speech recognition (ASR) systems classify structured sequence data, where the
label sequences (sentences) must be inferred from the observation sequences (the acoustic …

Morphological decomposition in Arabic ASR systems

F Diehl, MJF Gales, M Tomalin… - Computer Speech & …, 2012 - Elsevier
In recent years, the use of morphological decomposition strategies for Arabic Automatic
Speech Recognition (ASR) has become increasingly popular. Systems trained on …