End-to-end speech recognition contextualization with large language models

E Lakomkin, C Wu, Y Fathullah, O Kalinli… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
In recent years, Large Language Models (LLMs) have garnered significant attention from the
research community due to their exceptional performance and generalization capabilities. In …

Contextualized end-to-end automatic speech recognition with intermediate biasing loss

M Shakeel, Y Sudo, Y Peng, S Watanabe - arXiv preprint arXiv …, 2024 - arxiv.org
Contextualized end-to-end automatic speech recognition has been an active research area,
with recent efforts focusing on the implicit learning of contextual phrases based on the final …

Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition

K Huang, A Zhang, B Zhang, T Xu… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
The attention-based deep contextual biasing method has been demonstrated to effectively
improve the recognition performance of end-to-end automatic speech recognition (ASR) …

Deferred NAM: Low-latency Top-K Context Injection via Deferred Context Encoding for Non-Streaming ASR

Z Wu, G Song, C Li, P Rondon, Z Meng, X Velez… - arXiv preprint arXiv …, 2024 - arxiv.org
Contextual biasing enables speech recognizers to transcribe important phrases in the
speaker's context, such as contact names, even if they are rare in, or absent from, the …

Contextual biasing with the Knuth-Morris-Pratt matching algorithm

W Wang, Z Wu, D Caseiro, T Munkhdalai… - arXiv preprint arXiv …, 2023 - arxiv.org
Contextual biasing refers to the problem of biasing the automatic speech recognition (ASR)
systems towards rare entities that are relevant to the specific user or application scenarios …

Improving ASR Contextual Biasing with Guided Attention

J Tang, K Kim, S Shon, F Wu… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
In this paper, we propose a Guided Attention (GA) auxiliary training loss, which improves the
effectiveness and robustness of automatic speech recognition (ASR) contextual biasing …

Contextual Biasing with Confidence-based Homophone Detector for Mandarin End-to-End Speech Recognition

C Yang, L Zheng, S Tian, G Cheng, S Xiao… - Proc. Interspeech …, 2024 - isca-archive.org
Deep biasing methods and shallow fusion methods have been demonstrated to improve the
performance of end-to-end ASR effectively. However, accurate recognition often becomes …

An efficient text augmentation approach for contextualized Mandarin speech recognition

N Zheng, X Wan, K Liu, Z Du, Z Huan - arXiv preprint arXiv:2406.09950, 2024 - arxiv.org
Although contextualized automatic speech recognition (ASR) systems are commonly used to
improve the recognition of uncommon words, their effectiveness is hindered by the inherent …

Hierarchical attention-based contextual biasing for personalized speech recognition using neural transducers

S Tong, P Harding, S Wiesler - 2023 IEEE Automatic Speech …, 2023 - ieeexplore.ieee.org
Although end-to-end (E2E) automatic speech recognition (ASR) systems excel in general
tasks, they frequently struggle with accurately recognizing personal rare words. Leveraging …

XCB: an effective contextual biasing approach to bias cross-lingual phrases in speech recognition

X Wan, N Zheng, K Liu, H Zhou - arXiv preprint arXiv:2408.10524, 2024 - arxiv.org
Contextualized ASR models have been demonstrated to effectively improve the recognition
accuracy of uncommon phrases when a predefined phrase list is available. However, these …