Multi-style training for robust isolated-word speech recognition

J Li, L Deng, Y Gong… - IEEE/ACM Transactions …, 2014 - ieeexplore.ieee.org

New waves of consumer-centric applications, such as voice search and voice interaction
with mobile devices and home entertainment systems, increasingly require automatic …

Cited by 680 Related articles All 9 versions

[PDF] illinois.edu

A tutorial on hidden Markov models and selected applications in speech recognition

LR Rabiner - Proceedings of the IEEE, 1989 - ieeexplore.ieee.org

This tutorial provides an overview of the basic theory of hidden Markov models (HMMs) as
originated by LE Baum and T. Petrie (1966) and gives practical details on methods of …

Cited by 33615 Related articles All 61 versions

[PDF] danielpovey.com

A study on data augmentation of reverberant speech for robust speech recognition

T Ko, V Peddinti, D Povey, ML Seltzer… - … on acoustics, speech …, 2017 - ieeexplore.ieee.org

The environmental robustness of DNN-based acoustic models can be significantly improved
by using multi-condition training data. However, as data collection is a costly proposition …

Cited by 1104 Related articles All 8 versions

Data augmentation for deep neural network acoustic modeling

X Cui, V Goel, B Kingsbury - IEEE/ACM Transactions on Audio …, 2015 - ieeexplore.ieee.org

This paper investigates data augmentation for deep neural network acoustic modeling
based on label-preserving transformations to deal with data sparsity. Two data …

Cited by 540 Related articles All 11 versions

[PDF] arxiv.org

Exploring speech enhancement with generative adversarial networks for robust speech recognition

C Donahue, B Li, R Prabhavalkar - 2018 IEEE international …, 2018 - ieeexplore.ieee.org

We investigate the effectiveness of generative adversarial networks (GANs) for speech
enhancement, in the context of improving noise robustness of automatic speech recognition …

Cited by 279 Related articles All 6 versions

[PDF] arxiv.org

Recurrent neural network transducer for audio-visual speech recognition

T Makino, H Liao, Y Assael… - 2019 IEEE automatic …, 2019 - ieeexplore.ieee.org

This work presents a large-scale audio-visual speech recognition system based on a
recurrent neural network transducer (RNN-T) architecture. To support the development of …

Cited by 157 Related articles All 3 versions

[PDF] telkomuniversity.ac.id

Hidden Markov models for speech recognition

BH Juang, LR Rabiner - Technometrics, 1991 - Taylor & Francis

The use of hidden Markov models for speech recognition has become predominant in the
last several years, as evidenced by the number of published papers and talks at major …

Cited by 2616 Related articles All 17 versions

[PDF] hal.science

Automatic speech recognition and speech variability: A review

M Benzeghiba, R De Mori, O Deroo, S Dupont… - Speech …, 2007 - Elsevier

Major progress is being recorded regularly on both the technology and exploitation of
automatic speech recognition (ASR) and spoken language systems. However, there are still …

Cited by 752 Related articles All 24 versions

[PDF] arxiv.org

VoiceFilter-Lite: Streaming targeted voice separation for on-device speech recognition

Q Wang, IL Moreno, M Saglam, K Wilson… - arXiv preprint arXiv …, 2020 - arxiv.org

We introduce VoiceFilter-Lite, a single-channel source separation model that runs on the
device to preserve only the speech signals from a target user, as part of a streaming speech …

Cited by 101 Related articles All 11 versions

[PDF] googleapis.com

Variable-component deep neural network for robust speech recognition

J Li, R Zhao, Y Gong - US Patent 10,019,990, 2018 - Google Patents

Abstract: Systems and methods for speech recognition incorporating environmental variables
are provided. The systems and methods capture speech to be recognized. The speech is …

Cited by 224 Related articles All 4 versions
