Can generative large language models perform ASR error correction?

R Ma, M Qian, P Manakul, M Gales, K Knill - arXiv preprint arXiv …, 2023 - arxiv.org
ASR error correction continues to serve as an important part of post-processing for speech
recognition systems. Traditionally, these models are trained with supervised training using …

FastCorrect 2: Fast error correction on multiple candidates for automatic speech recognition

Y Leng, X Tan, R Wang, L Zhu, J Xu, W Liu… - arXiv preprint arXiv …, 2021 - arxiv.org
Error correction is widely used in automatic speech recognition (ASR) to post-process the
generated sentence, and can further reduce the word error rate (WER). Although multiple …

Translation-enhanced multilingual text-to-image generation

Y Li, CY Chang, S Rawls, I Vulić… - arXiv preprint arXiv …, 2023 - arxiv.org
Research on text-to-image generation (TTI) still predominantly focuses on the English
language due to the lack of annotated image-caption data in other languages; in the long …

ASR Error Correction using Large Language Models

R Ma, M Qian, M Gales, K Knill - arXiv preprint arXiv:2409.09554, 2024 - arxiv.org
Error correction (EC) models play a crucial role in refining Automatic Speech Recognition
(ASR) transcriptions, enhancing the readability and quality of transcriptions. Without …

End-to-end speech to intent prediction to improve E-commerce customer support voicebot in Hindi and English

A Goyal, A Singh, N Garera - arXiv preprint arXiv:2211.07710, 2022 - arxiv.org
Automation of on-call customer support relies heavily on accurate and efficient speech-to-
intent (S2I) systems. Building such systems using multi-component pipelines can pose …

Building robust spoken language understanding by cross attention between phoneme sequence and ASR hypothesis

Z Wang, Y Le, Y Zhu, Y Zhao, M Feng… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Building Spoken Language Understanding (SLU) robust to Automatic Speech Recognition
(ASR) errors is an essential issue for various voice-enabled virtual assistants. Considering …

GEC-RAG: Improving Generative Error Correction via Retrieval-Augmented Generation for Automatic Speech Recognition Systems

A Robatian, M Hajipour, MR Peyghan, F Rajabi… - arXiv preprint arXiv …, 2025 - arxiv.org
Automatic Speech Recognition (ASR) systems have demonstrated remarkable performance
across various applications. However, limited data and the unique language features of …

Recent Progress in Conversational AI

Z Xue, R Li, M Li - arXiv preprint arXiv:2204.09719, 2022 - arxiv.org
Conversational artificial intelligence (AI) is becoming an increasingly popular topic among
industry and academia. With the fast development of neural network-based models, a lot of …

A Token-Wise Beam Search Algorithm for RNN-T

G Keren - 2023 IEEE Automatic Speech Recognition and …, 2023 - ieeexplore.ieee.org
Standard Recurrent Neural Network Transducers (RNN-T) decoding algorithms for speech
recognition are iterating over the time axis, such that one time step is decoded before …

An ASR N-Best Transcript Neural Ranking Model for Spoken Content Retrieval

Y Moriya, GJF Jones - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org
Spoken Content Retrieval (SCR) using ASR transcripts is increasingly important for
multimedia content archives. However, SCR is often impacted by ASR errors. In recent …