Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

On addressing practical challenges for RNN-transducer

R Zhao, J Xue, J Li, W Wei, L He… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
In this paper, several works are proposed to address practical challenges for deploying
RNN Transducer (RNN-T) based speech recognition systems. These challenges are …

Unsupervised uncertainty measures of automatic speech recognition for non-intrusive speech intelligibility prediction

Z Tu, N Ma, J Barker - arXiv preprint arXiv:2204.04288, 2022 - arxiv.org
Non-intrusive intelligibility prediction is important for its application in realistic scenarios,
where a clean reference signal is difficult to access. The construction of many non-intrusive …

ASR rescoring and confidence estimation with ELECTRA

H Futami, H Inaguma, M Mimura… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
In automatic speech recognition (ASR) rescoring, the hypothesis with the fewest errors
should be selected from the n-best list using a language model (LM). However, LMs are …

UbiComb: a hybrid deep learning model for predicting plant-specific protein ubiquitylation sites

A Siraj, DY Lim, H Tayara, KT Chong - Genes, 2021 - mdpi.com
Protein ubiquitylation is an essential post-translational modification process that performs a
critical role in a wide range of biological functions, even a degenerative role in certain …

Understanding disrupted sentences using underspecified abstract meaning representation

A Addlesee, M Damonte - 2023 - amazon.science
Voice assistant accessibility is generally overlooked as today's spoken dialogue systems are
trained on huge corpora to help them understand the 'average' user. This raises frustrating …

DIANA, a process-oriented model of human auditory word recognition

L Ten Bosch, L Boves, M Ernestus - Brain Sciences, 2022 - mdpi.com
This article presents DIANA, a new, process-oriented model of human auditory word
recognition, which takes as its input the acoustic signal and can produce as its output word …

Residual energy-based models for end-to-end speech recognition

Q Li, Y Zhang, B Li, L Cao, PC Woodland - arXiv preprint arXiv …, 2021 - arxiv.org
End-to-end models with auto-regressive decoders have shown impressive results for
automatic speech recognition (ASR). These models formulate the sequence-level probability …

Fast entropy-based methods of word-level confidence estimation for end-to-end automatic speech recognition

A Laptev, B Ginsburg - 2022 IEEE Spoken Language …, 2023 - ieeexplore.ieee.org
This paper presents a class of new fast non-trainable entropy-based confidence estimation
methods for automatic speech recognition. We show how per-frame entropy values can be …

Cross-modal ASR post-processing system for error correction and utterance rejection

J Du, S Pu, Q Dong, C Jin, X Qi, D Gu, R Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Although modern automatic speech recognition (ASR) systems can achieve high
performance, they may produce errors that weaken readers' experience and do harm to …