Language identification is critical for many downstream tasks in automatic speech recognition (ASR), and is beneficial to integrate into multilingual end-to-end ASR as an …
Dual-encoder structure successfully utilizes two language-specific encoders (LSEs) for code-switching speech recognition. Because LSEs are initialized by two pre-trained language …
Despite the rapid progress in automatic speech recognition (ASR) research, recognizing multilingual speech using a unified ASR system remains highly challenging. Previous works …
W Wang, G Ma, Y Li, B Du - arXiv preprint arXiv:2307.05956, 2023 - arxiv.org
Multilingual speech recognition for both monolingual and code-switching speech is a challenging task. Recently, based on the Mixture of Experts (MoE), many works have made …
Languages usually switch within a multilingual speech signal, especially in a bilingual society. This phenomenon is referred to as code-switching (CS), making automatic speech …
In this work, we seek to build effective code-switched (CS) automatic speech recognition systems (ASR) under the zero-shot setting where no transcribed CS speech data is …
Y Yang, Y Peng, H Huang, ES Chng… - 2024 Asia Pacific …, 2024 - ieeexplore.ieee.org
This paper reports on SOTA results achieved using OpenAI's Whisper model with adaptation on different adaptation corpus sizes for two established code-switch Mandarin/English …
Y Peng, Y Liu, J Zhang, H Xu, Y He, H Huang… - arXiv preprint arXiv …, 2022 - arxiv.org
Internal Language Model Estimation (ILME) based language model (LM) fusion has been shown to significantly improve recognition results over conventional shallow fusion in both …
Z Liang, Z Song, Z Ma, C Du, K Yu, X Chen - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, end-to-end (E2E) automatic speech recognition (ASR) models have made great strides and exhibit excellent performance in general speech recognition. However, there …