A light-weight contextual spelling correction model for customizing transducer-based speech...

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 380 · Related articles · All 7 versions

[PDF] arxiv.org

Towards contextual spelling correction for customization of end-to-end speech recognition systems

X Wang, Y Liu, J Li, V Miljanic, S Zhao… - … /ACM Transactions on …, 2022 - ieeexplore.ieee.org

Contextual biasing is an important and challenging task for end-to-end automatic speech
recognition (ASR) systems, which aims to achieve better recognition performance by biasing …

Cited by 24 · Related articles · All 4 versions

[PDF] arxiv.org

FoundationTTS: Text-to-speech for ASR customization with generative language model

R Xue, Y Liu, L He, X Tan, L Liu, E Lin… - arXiv preprint arXiv …, 2023 - arxiv.org

Neural text-to-speech (TTS) generally consists of cascaded architecture with separately
optimized acoustic model and vocoder, or end-to-end architecture with continuous mel …

Cited by 9 · Related articles · All 2 versions

[PDF] arxiv.org

Improving contextual spelling correction by external acoustics attention and semantic aware data augmentation

X Wang, Y Liu, J Li, S Zhao - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

We previously proposed contextual spelling correction (CSC) to correct the output of end-to-
end (E2E) automatic speech recognition (ASR) models with contextual information such as …

Cited by 7 · Related articles · All 3 versions

Contextual Spelling Correction with Large Language Models

G Song, Z Wu, G Pundak, A Chandorkar… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

Contextual Spelling Correction (CSC) models are used to improve automatic speech
recognition (ASR) quality given user-specific context. Typically, context is modeled as a large …

Cited by 3 · Related articles

[PDF] arxiv.org

SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings

A Antonova, E Bakhturina, B Ginsburg - arXiv preprint arXiv:2306.02317, 2023 - arxiv.org

Contextual spelling correction models are an alternative to shallow fusion to improve
automatic speech recognition (ASR) quality given user vocabulary. To deal with large user …

Cited by 6 · Related articles · All 6 versions

[PDF] arxiv.org

Deferred NAM: Low-latency Top-K Context Injection via Deferred Context Encoding for Non-Streaming ASR

Z Wu, G Song, C Li, P Rondon, Z Meng, X Velez… - arXiv preprint arXiv …, 2024 - arxiv.org

Contextual biasing enables speech recognizers to transcribe important phrases in the
speaker's context, such as contact names, even if they are rare in, or absent from, the …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Beyond Hard Samples: Robust and Effective Grammatical Error Correction with Cycle Self-Augmenting

K Feng, Z Tang, J Li, M Zhang - CCF International Conference on Natural …, 2023 - Springer

Recent studies have revealed that grammatical error correction methods in the sequence-to-
sequence paradigm are vulnerable to adversarial attacks. Large Language Models (LLMs) …

Cited by 1 · Related articles · All 4 versions

[PDF] infocomm-journal.com

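Text proofreading model for speech recognition based on Chinese semantic-phonological information

仲美玉, 吴培良, 窦燕, 刘毅, 孔令富 - Journal on Communications (通信学报), 2022 - infocomm-journal.com

To study the effect of pinyin on detecting and correcting errors in speech recognition text, a text
proofreading model based on Chinese semantic-phonological information is proposed. Five pinyin encoding methods are defined to construct character-phonological embedding vectors …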

Cited by 2 · Related articles · All 2 versions

[PDF] arxiv.org

Have best of both worlds: Two-pass hybrid and E2E cascading framework for speech recognition

G Ye, V Mazalov, J Li, Y Gong - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

Hybrid and end-to-end (E2E) systems have their individual advantages, with different error
patterns in the speech recognition results. By jointly modeling audio and text, the E2E model …

Cited by 8 · Related articles · All 4 versions
