An Improved Goodness of Pronunciation (GoP) Measure for Pronunciation Evaluation with DNN-HMM System Considering HMM Transition Probabilities.

S Sudhakara, MK Ramanathi, C Yarra, PK Ghosh - INTERSPEECH, 2019 - academia.edu
Goodness of pronunciation (GoP) is typically formulated with Gaussian mixture
model-hidden Markov model (GMM-HMM) based acoustic models considering HMM state transition …

Unsupervised pre-trained filter learning approach for efficient convolution neural network

S ur Rehman, S Tu, M Waqas, Y Huang, O ur Rehman… - Neurocomputing, 2019 - Elsevier
The concept of Convolution Neural Network (ConvNet or CNN) is evaluated from
the animal visual cortex. Since humans can learn through experience, similarly, ConvNet …

Multiple proposals for continuous Arabic sign language recognition

M Hassan, K Assaleh, T Shanableh - Sensing and Imaging, 2019 - Springer
The deaf community relies on sign language as the primary means of communication. For
the millions of people around the world who suffer from hearing loss, interaction with hearing …

GFCC based discriminatively trained noise robust continuous ASR system for Hindi language

M Dua, RK Aggarwal, M Biswas - Journal of Ambient Intelligence and …, 2019 - Springer
A statistically designed Automatic Speech Recognition (ASR) system extracts features from
speech signals using feature extraction methods, links the extracted features with the …

Bi-directional lattice recurrent neural networks for confidence estimation

Q Li, PM Ness, A Ragni… - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
The standard approach to mitigate errors made by an automatic speech recognition system
is to use confidence scores associated with each predicted word. In the simplest case, these …

Discriminative training using noise robust integrated features and refined HMM modeling

M Dua, RK Aggarwal, M Biswas - Journal of Intelligent Systems, 2019 - degruyter.com
The classical approach to build an automatic speech recognition (ASR) system uses
different feature extraction methods at the front end and various parameter classification …

Integrating source-channel and attention-based sequence-to-sequence models for speech recognition

Q Li, C Zhang, PC Woodland - 2019 IEEE Automatic Speech …, 2019 - ieeexplore.ieee.org
This paper proposes a novel automatic speech recognition (ASR) framework called
Integrated Source-Channel and Attention (ISCA) that combines the advantages of traditional …

Challenging the boundaries of speech recognition: the MALACH corpus

M Picheny, Z Tüske, B Kingsbury, K Audhkhasi… - arXiv preprint arXiv …, 2019 - arxiv.org
There has been huge progress in speech recognition over the last several years. Tasks
once thought extremely difficult, such as SWITCHBOARD, now approach levels of human …

Promising accurate prefix boosting for sequence-to-sequence ASR

MK Baskar, L Burget, S Watanabe… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
In this paper, we present promising accurate prefix boosting (PAPB), a discriminative
training technique for attention based sequence-to-sequence (seq2seq) ASR. PAPB is …

PyHTK: Python library and ASR pipelines for HTK

C Zhang, FL Kreyssig, Q Li… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
This paper describes PyHTK, which is a Python-based library and associated pipeline to
facilitate the construction of large-scale complex automatic speech recognition (ASR) …