A Mostly Data-Driven Approach to Inverse Text Normalization.

FunAudioLLM: Voice understanding and generation foundation models for natural interaction between humans and LLMs

K An, Q Chen, C Deng, Z Du, C Gao, Z Gao… - arXiv preprint arXiv …, 2024 - arxiv.org

This report introduces FunAudioLLM, a model family designed to enhance natural voice
interactions between humans and large language models (LLMs). At its core are two …

Cited by 24 | Related articles | All 5 versions

Neural models of text normalization for speech applications

H Zhang, R Sproat, AH Ng, F Stahlberg… - Computational …, 2019 - direct.mit.edu

Machine learning, including neural network techniques, have been applied to
virtually every domain in natural language processing. One problem that has been …

Cited by 123 | Related articles | All 4 versions

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

PK O'Neill, V Lavrukhin, S Majumdar, V Noroozi… - arXiv preprint arXiv …, 2021 - arxiv.org

In the English speech-to-text (STT) machine learning task, acoustic models are
conventionally trained on uncased Latin characters, and any necessary orthography (such …

Cited by 47 | Related articles | All 8 versions

Neural inverse text normalization

M Sunkara, C Shivade, S Bodapati… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

While there have been several contributions exploring state of the art techniques for text
normalization, the problem of inverse text normalization (ITN) remains relatively unexplored …

Cited by 42 | Related articles | All 4 versions

Neural text normalization with subword units

C Mansfield, M Sun, Y Liu, A Gandhe… - Proceedings of the …, 2019 - aclanthology.org

Text normalization (TN) is an important step in conversational systems. It converts written
text to its spoken form to facilitate speech recognition, natural language understanding and …

Cited by 56 | Related articles | All 7 versions

NeMo inverse text normalization: From development to production

Y Zhang, E Bakhturina, K Gorman… - arXiv preprint arXiv …, 2021 - arxiv.org

Inverse text normalization (ITN) converts spoken-domain automatic speech recognition
(ASR) output into written-domain text to improve the readability of the ASR output. Many …

Cited by 33 | Related articles | All 7 versions

Text normalization using memory augmented neural networks

S Pramanik, A Hussain - Speech Communication, 2019 - Elsevier

We perform text normalization, i.e., the transformation of words from the written to the spoken
form, using a memory augmented neural network. With the addition of dynamic memory …

Cited by 42 | Related articles | All 7 versions

Streaming, fast and accurate on-device inverse text normalization for automatic speech recognition

Y Gaur, N Kibre, J Xue, K Shu, Y Wang… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org

Automatic Speech Recognition (ASR) systems typically yield output in lexical form. However,
humans prefer a written form output. To bridge this gap, ASR systems usually employ …

Cited by 7 | Related articles | All 3 versions

Transcribing speech as spoken and written dual text using an autoregressive model

M Ihori, H Sato, T Tanaka, R Masumura… - Proc. INTERSPEECH …, 2023 - isca-archive.org

This paper proposes a novel method to jointly generate spoken and written text from input
speech for expanding use cases of speech-based applications. The spoken text generated …

Cited by 3 | Related articles | All 2 versions

Multi transcription-style speech transcription using attention-based encoder-decoder model

Y Huang, P Behre, G Ye, S Chang… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

Human professional transcription services provide a variety of transcription styles to
customize different needs. To accommodate different users and facilitate seamless …

Cited by 1 | Related articles
