Delayed Fusion: Integrating Large Language Models into First-Pass Decoding in End-to-end Speech Recognition

T Hori, M Kocour, A Haider, E McDermott… - arXiv preprint arXiv …, 2025 - arxiv.org
This paper presents an efficient decoding approach for end-to-end automatic speech
recognition (E2E-ASR) with large language models (LLMs). Although shallow fusion is the …

Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge

R Qin, D Liu, G Xu, Z Yan, C Xu, Y Hu, XS Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
The combination of Large Language Models (LLM) and Automatic Speech Recognition
(ASR), when deployed on edge devices (called edge ASR-LLM), can serve as a powerful …