Minimum classification error rate methods for speech recognition

L Deng - APSIPA transactions on Signal and Information …, 2014 - cambridge.org

In this invited paper, my overview material on the same topic as presented in the plenary
overview session of APSIPA-2011 and the tutorial material presented in the same …

Cited by 974 · Related articles · All 4 versions

[PDF] ntua.gr

Machine learning paradigms for speech recognition: An overview

L Deng, X Li - IEEE Transactions on Audio, Speech, and …, 2013 - ieeexplore.ieee.org

Automatic Speech Recognition (ASR) has historically been a driving force behind many
machine learning (ML) techniques, including the ubiquitously used hidden Markov model …

Cited by 579 · Related articles · All 11 versions

[PDF] thecvf.com

Stacked cross attention for image-text matching

KH Lee, X Chen, G Hua, H Hu… - Proceedings of the …, 2018 - openaccess.thecvf.com

In this paper, we study the problem of image-text matching. Inferring the latent semantic
alignment between objects or other salient stuff (e.g. snow, sky, lawn) and the corresponding …

Cited by 1327 · Related articles · All 8 versions

[PDF] thecvf.com

AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks

T Xu, P Zhang, Q Huang, H Zhang… - Proceedings of the …, 2018 - openaccess.thecvf.com

In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that
allows attention-driven, multi-stage refinement for fine-grained text-to-image generation …

Cited by 2016 · Related articles · All 16 versions

[PDF] nowpublishers.com

Deep learning: methods and applications

L Deng, D Yu - Foundations and trends® in signal processing, 2014 - nowpublishers.com

This monograph provides an overview of general deep learning methodology and its
applications to a variety of signal and information processing tasks. The application areas …

Cited by 5948 · Related articles · All 13 versions

[PDF] academia.edu

Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition

GE Dahl, D Yu, L Deng, A Acero - IEEE Transactions on audio …, 2011 - ieeexplore.ieee.org

We propose a novel context-dependent (CD) model for large-vocabulary speech recognition
(LVSR) that leverages recent advances in using deep belief networks for phone recognition …

Cited by 3976 · Related articles · All 19 versions

[PDF] nii.ac.jp

Statistical parametric speech synthesis

H Zen, K Tokuda, AW Black - Speech Communication, 2009 - Elsevier

This review gives a general overview of techniques used in statistical parametric speech
synthesis. One instance of these techniques, called hidden Markov model (HMM)-based …

Cited by 1634 · Related articles · All 25 versions

[PDF] lecun.com

[PDF] A tutorial on energy-based learning

Y LeCun, S Chopra, R Hadsell, M Ranzato… - Predicting structured …, 2006 - yann.lecun.com

Abstract Energy-Based Models (EBMs) capture dependencies between variables by
associating a scalar energy to each configuration of the variables. Inference consists in …

Cited by 1687 · Related articles · All 11 versions

[PDF] arxiv.org

Speech recognition by machine, a review

MA Anusuya, SK Katti - arXiv preprint arXiv:1001.2267, 2010 - arxiv.org

This paper presents a brief survey on Automatic Speech Recognition and discusses the
major themes and advances made in the past 60 years of research, so as to provide a …

Cited by 735 · Related articles · All 2 versions

[PDF] ieee.org

Spoken language recognition: from fundamentals to practice

H Li, B Ma, KA Lee - Proceedings of the IEEE, 2013 - ieeexplore.ieee.org

Spoken language recognition refers to the automatic process through which we determine
or verify the identity of the language spoken in a speech sample. We study a computational …

Cited by 364 · Related articles · All 8 versions
