A tutorial survey of architectures, algorithms, and applications for deep learning

L Deng - APSIPA Transactions on Signal and Information …, 2014 - cambridge.org
In this invited paper, my overview material on the same topic as presented in the plenary
overview session of APSIPA-2011 and the tutorial material presented in the same …

Machine learning paradigms for speech recognition: An overview

L Deng, X Li - IEEE Transactions on Audio, Speech, and …, 2013 - ieeexplore.ieee.org
Automatic Speech Recognition (ASR) has historically been a driving force behind many
machine learning (ML) techniques, including the ubiquitously used hidden Markov model …

Stacked cross attention for image-text matching

KH Lee, X Chen, G Hua, H Hu… - Proceedings of the …, 2018 - openaccess.thecvf.com
In this paper, we study the problem of image-text matching. Inferring the latent semantic
alignment between objects or other salient stuff (e.g., snow, sky, lawn) and the corresponding …

AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks

T Xu, P Zhang, Q Huang, H Zhang… - Proceedings of the …, 2018 - openaccess.thecvf.com
In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that
allows attention-driven, multi-stage refinement for fine-grained text-to-image generation …

Deep learning: methods and applications

L Deng, D Yu - Foundations and Trends® in Signal Processing, 2014 - nowpublishers.com
This monograph provides an overview of general deep learning methodology and its
applications to a variety of signal and information processing tasks. The application areas …

Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition

GE Dahl, D Yu, L Deng, A Acero - IEEE Transactions on Audio …, 2011 - ieeexplore.ieee.org
We propose a novel context-dependent (CD) model for large-vocabulary speech recognition
(LVSR) that leverages recent advances in using deep belief networks for phone recognition …

Statistical parametric speech synthesis

H Zen, K Tokuda, AW Black - Speech Communication, 2009 - Elsevier
This review gives a general overview of techniques used in statistical parametric speech
synthesis. One instance of these techniques, called hidden Markov model (HMM)-based …

A tutorial on energy-based learning

Y LeCun, S Chopra, R Hadsell, M Ranzato… - Predicting structured …, 2006 - yann.lecun.com
Energy-Based Models (EBMs) capture dependencies between variables by
associating a scalar energy to each configuration of the variables. Inference consists in …

Speech recognition by machine, a review

MA Anusuya, SK Katti - arXiv preprint arXiv:1001.2267, 2010 - arxiv.org
This paper presents a brief survey on Automatic Speech Recognition and discusses the
major themes and advances made in the past 60 years of research, so as to provide a …

Spoken language recognition: from fundamentals to practice

H Li, B Ma, KA Lee - Proceedings of the IEEE, 2013 - ieeexplore.ieee.org
Spoken language recognition refers to the automatic process through which we determine
or verify the identity of the language spoken in a speech sample. We study a computational …