ASR n-best fusion nets

Can generative large language models perform ASR error correction?

R Ma, M Qian, P Manakul, M Gales, K Knill - arXiv preprint arXiv …, 2023 - arxiv.org

ASR error correction continues to serve as an important part of post-processing for speech
recognition systems. Traditionally, these models are trained with supervised training using …

Cited by 42 - Related articles - All 3 versions

[PDF] arxiv.org

FastCorrect 2: Fast error correction on multiple candidates for automatic speech recognition

Y Leng, X Tan, R Wang, L Zhu, J Xu, W Liu… - arXiv preprint arXiv …, 2021 - arxiv.org

Error correction is widely used in automatic speech recognition (ASR) to post-process the
generated sentence, and can further reduce the word error rate (WER). Although multiple …

Cited by 46 - Related articles - All 4 versions

[PDF] arxiv.org

Translation-enhanced multilingual text-to-image generation

Y Li, CY Chang, S Rawls, I Vulić… - arXiv preprint arXiv …, 2023 - arxiv.org

Research on text-to-image generation (TTI) still predominantly focuses on the English
language due to the lack of annotated image-caption data in other languages; in the long …

Cited by 8 - Related articles - All 5 versions

[PDF] arxiv.org

ASR Error Correction using Large Language Models

R Ma, M Qian, M Gales, K Knill - arXiv preprint arXiv:2409.09554, 2024 - arxiv.org

Error correction (EC) models play a crucial role in refining Automatic Speech Recognition
(ASR) transcriptions, enhancing the readability and quality of transcriptions. Without …

Cited by 1 - Related articles - All 3 versions

[PDF] arxiv.org

End-to-end speech to intent prediction to improve E-commerce customer support voicebot in Hindi and English

A Goyal, A Singh, N Garera - arXiv preprint arXiv:2211.07710, 2022 - arxiv.org

Automation of on-call customer support relies heavily on accurate and efficient speech-to-intent
(S2I) systems. Building such systems using multi-component pipelines can pose …

Cited by 6 - Related articles - All 5 versions

[PDF] arxiv.org

Building robust spoken language understanding by cross attention between phoneme sequence and ASR hypothesis

Z Wang, Y Le, Y Zhu, Y Zhao, M Feng… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Building Spoken Language Understanding (SLU) robust to Automatic Speech Recognition
(ASR) errors is an essential issue for various voice-enabled virtual assistants. Considering …

Cited by 6 - Related articles - All 5 versions

[PDF] arxiv.org

GEC-RAG: Improving Generative Error Correction via Retrieval-Augmented Generation for Automatic Speech Recognition Systems

A Robatian, M Hajipour, MR Peyghan, F Rajabi… - arXiv preprint arXiv …, 2025 - arxiv.org

Automatic Speech Recognition (ASR) systems have demonstrated remarkable performance
across various applications. However, limited data and the unique language features of …

[PDF] arxiv.org

Recent Progress in Conversational AI

Z Xue, R Li, M Li - arXiv preprint arXiv:2204.09719, 2022 - arxiv.org

Conversational artificial intelligence (AI) is becoming an increasingly popular topic among
industry and academia. With the fast development of neural network-based models, a lot of …

Cited by 3 - Related articles - All 2 versions

[PDF] arxiv.org

A Token-Wise Beam Search Algorithm for RNN-T

G Keren - 2023 IEEE Automatic Speech Recognition and …, 2023 - ieeexplore.ieee.org

Standard Recurrent Neural Network Transducers (RNN-T) decoding algorithms for speech
recognition are iterating over the time axis, such that one time step is decoded before …

Cited by 1 - Related articles - All 3 versions

[PDF] github.io

An ASR N-Best Transcript Neural Ranking Model for Spoken Content Retrieval

Y Moriya, GJF Jones - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org

Spoken Content Retrieval (SCR) using ASR transcripts is increasingly important for
multimedia content archives. However, SCR is often impacted by ASR errors. In recent …

Cited by 2 - Related articles - All 2 versions
