Contextual Biasing Speech Recognition in Speech-enhanced Large Language Model

X Gong, A Lv, Z Wang, Y Qian - Proc. Interspeech. ISCA, 2024 - isca-archive.org
Recently, the rapid advancements in audio- and speech-enhanced large language models
(SpeechLLMs), such as Qwen-Audio and SALMONN, have significantly propelled automatic …

An efficient text augmentation approach for contextualized Mandarin speech recognition

N Zheng, X Wan, K Liu, Z Du, Z Huan - arXiv preprint arXiv:2406.09950, 2024 - arxiv.org
Although contextualized automatic speech recognition (ASR) systems are commonly used to
improve the recognition of uncommon words, their effectiveness is hindered by the inherent …

Contextualized Automatic Speech Recognition with Dynamic Vocabulary

Y Sudo, Y Fukumoto, M Shakeel, Y Peng… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep biasing (DB) improves the performance of end-to-end automatic speech recognition
(E2E-ASR) for rare words or contextual phrases using a bias list. However, most existing …