Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss

M Shakeel, Y Sudo, Y Peng, S Watanabe - arXiv preprint arXiv …, 2024 - arxiv.org
Contextualized end-to-end automatic speech recognition has been an active research area,
with recent efforts focusing on the implicit learning of contextual phrases based on the final …

Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval

N Flemotomos, R Hsiao, P Swietojanski, T Hori… - arXiv preprint arXiv …, 2024 - arxiv.org
Neural contextual biasing allows speech recognition models to leverage contextually
relevant information, leading to improved transcription accuracy. However, the biasing …

InterBiasing: Boost Unseen Word Recognition through Biasing Intermediate Predictions

Y Nakagome, M Hentschel - arXiv preprint arXiv:2406.14890, 2024 - arxiv.org
Despite recent advances in end-to-end speech recognition methods, their output is biased to
the training data's vocabulary, resulting in inaccurate recognition of unknown terms or …

CTC-Assisted LLM-Based Contextual ASR

G Yang, Z Ma, Z Gao, S Zhang, X Chen - arXiv preprint arXiv:2411.06437, 2024 - arxiv.org
Contextual ASR or hotword customization holds substantial practical value. Despite the
impressive performance of current end-to-end (E2E) automatic speech recognition (ASR) …

Contextualized Automatic Speech Recognition with Dynamic Vocabulary

Y Sudo, Y Fukumoto, M Shakeel, Y Peng… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep biasing (DB) improves the performance of end-to-end automatic speech recognition
(E2E-ASR) for rare words or contextual phrases using a bias list. However, most existing …