Minimum phone error and I-smoothing for improved discriminative training

M Tanveer, A Rastogi, V Paliwal, MA Ganaie, AK Malik… - Neurocomputing, 2023 - Elsevier

Abstract Machine learning methods are extensively used for processing and analysing
speech signals by virtue of their performance gains over multiple domains. Deep learning …

Cited by 13 Related articles All 5 versions

[BOOK][B] Distant speech recognition

M Wölfel, J McDonough - 2009 - books.google.com

A complete overview of distant automatic speech recognition. The performance of
conventional Automatic Speech Recognition (ASR) systems degrades dramatically as soon …

Cited by 449 Related articles All 6 versions

[PDF] whiterose.ac.uk

Speech recognition and keyword spotting for low-resource languages: Babel project research at CUED

MJF Gales, KM Knill, A Ragni… - … workshop on spoken …, 2014 - eprints.whiterose.ac.uk

Recently there has been increased interest in Automatic Speech Recognition (ASR) and
Key Word Spotting (KWS) systems for low resource languages. One of the driving forces for …

Cited by 209 Related articles All 10 versions

[PDF] academia.edu

Boosted MMI for model and feature-space discriminative training

D Povey, D Kanevsky, B Kingsbury… - … , Speech and Signal …, 2008 - ieeexplore.ieee.org

We present a modified form of the maximum mutual information (MMI) objective function
which gives improved results for discriminative training. The modification consists of …

Cited by 496 Related articles All 14 versions

Audio-visual deep learning for noise robust speech recognition

J Huang, B Kingsbury - 2013 IEEE international conference on …, 2013 - ieeexplore.ieee.org

Deep belief networks (DBN) have shown impressive improvements over Gaussian mixture
models for automatic speech recognition. In this work we use DBNs for audio-visual speech …

Cited by 224 Related articles All 4 versions

Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling

B Kingsbury - 2009 IEEE International Conference on Acoustics …, 2009 - ieeexplore.ieee.org

Acoustic models used in hidden Markov model/neural-network (HMM/NN) speech
recognition systems are usually trained with a frame-based cross-entropy error criterion. In …

Cited by 349 Related articles All 7 versions

An enhanced stacked LSTM method with no random initialization for malware threat hunting in safety and time-critical systems

AN Jahromi, S Hashemi… - … on Emerging Topics …, 2020 - ieeexplore.ieee.org

Malware detection is an increasingly important operational focus in cyber security,
particularly given the fast pace of such threats (e.g., new malware variants introduced every …

Cited by 86 Related articles All 2 versions

[PDF] isca-archive.org

[PDF] Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization.

B Kingsbury, TN Sainath, H Soltau - Interspeech, 2012 - isca-archive.org

Training neural network acoustic models with sequence-discriminative criteria, such as
state-level minimum Bayes risk (sMBR), has been shown to produce large improvements in …

Cited by 280 Related articles All 7 versions

[PDF] psu.edu

[PDF] Hidden conditional random fields for phone classification.

A Gunawardana, M Mahajan, A Acero, JC Platt - Interspeech, 2005 - Citeseer

In this paper, we show the novel application of hidden conditional random fields (HCRFs)–
conditional random fields with hidden state sequences–for modeling speech. Hidden state …

Cited by 443 Related articles All 9 versions

[PDF] researchgate.net

Maximum F1-score discriminative training criterion for automatic mispronunciation detection

H Huang, H Xu, X Wang… - IEEE/ACM Transactions on …, 2015 - ieeexplore.ieee.org

We carry out an in-depth investigation on a newly proposed Maximum F1-score Criterion
(MFC) discriminative training objective function for Goodness of Pronunciation (GOP) based …

Cited by 149 Related articles All 4 versions
