Image classification using convolutional neural network (CNN) and recurrent neural network (RNN): A review

P Dhruv, S Naskar - … learning and information processing: proceedings of …, 2020 - Springer
With the advent of technologies, real-time data is essentially required for future
development. Every day, a huge amount of visual data is being collected, but to use it …

Word error rate estimation for speech recognition: e-WER

A Ali, S Renals - Proceedings of the 56th Annual Meeting of the …, 2018 - aclanthology.org
Measuring the performance of automatic speech recognition (ASR) systems requires
manually transcribed data in order to compute the word error rate (WER), which is often time …

Error detection and accuracy estimation in automatic speech recognition using deep bidirectional recurrent neural networks

A Ogawa, T Hori - Speech Communication, 2017 - Elsevier
Recurrent neural networks (RNNs) have recently been applied as the classifiers for
sequential labeling problems. In this paper, deep bidirectional RNNs (DBRNNs) are applied …

Spoken SQuAD: A study of mitigating the impact of speech recognition errors on listening comprehension

CH Li, SL Wu, CL Liu, H Lee - arXiv preprint arXiv:1804.00320, 2018 - arxiv.org
Reading comprehension has been widely studied. One of the most representative reading
comprehension tasks is Stanford Question Answering Dataset (SQuAD), on which machine …

Estimating confidence scores on ASR results using recurrent neural networks

K Kalgaonkar, C Liu, Y Gong… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
In this paper we present a confidence estimation system using recurrent neural networks
(RNN) and compare it to a traditional multilayer perceptron (MLP) based system. The …

Analyzing uncertainties in speech recognition using dropout

A Vyas, P Dighe, S Tong… - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
The performance of Automatic Speech Recognition (ASR) systems is often measured using
Word Error Rates (WER), which requires time-consuming and expensive manually …

Using phoneme representations to build predictive models robust to ASR errors

A Fang, S Filice, N Limsopatham… - Proceedings of the 43rd …, 2020 - dl.acm.org
Even though Automatic Speech Recognition (ASR) systems significantly improved over the
last decade, they still introduce a lot of errors when they transcribe voice to text. One of the …

Learning from past mistakes: improving automatic speech recognition output via noisy-clean phrase context modeling

PG Shivakumar, H Li, K Knight… - APSIPA Transactions on …, 2019 - cambridge.org
Automatic speech recognition (ASR) systems often make unrecoverable errors due to
subsystem pruning (acoustic, language and pronunciation models); for example, pruning …

ASR error detection and recognition rate estimation using deep bidirectional recurrent neural networks

A Ogawa, T Hori - … on Acoustics, Speech and Signal Processing …, 2015 - ieeexplore.ieee.org
Recurrent neural networks (RNNs) have recently been applied as the classifiers for
sequential labeling problems. In this paper, deep bidirectional RNNs (DBRNNs) are applied …

End-to-end speech to intent prediction to improve E-commerce customer support voicebot in Hindi and English

A Goyal, A Singh, N Garera - arXiv preprint arXiv:2211.07710, 2022 - arxiv.org
Automation of on-call customer support relies heavily on accurate and efficient speech-to-
intent (S2I) systems. Building such systems using multi-component pipelines can pose …