Joint modeling of code-switched and monolingual ASR via conditional factorization

B Yan, C Zhang, M Yu, SX Zhang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Conversational bilingual speech encompasses three types of utterances: two purely monolingual types and one intra-sententially code-switched type. In this work, we propose a general framework to jointly model the likelihoods of the monolingual and code-switch sub-tasks that comprise bilingual speech recognition. By defining the monolingual sub-tasks with label-to-frame synchronization, our joint modeling framework can be conditionally factorized such that the final bilingual output, which may or may not be code-switched, is obtained given only monolingual information. We show that this conditionally factorized joint framework can be modeled by an end-to-end differentiable neural network. We demonstrate the efficacy of our proposed model on bilingual Mandarin-English speech recognition across both monolingual and code-switched corpora.
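
One way to read the conditional factorization described in the abstract is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $X$ denotes the acoustic input, $Y$ the final bilingual output, and $Z^{\mathrm{M}}$, $Z^{\mathrm{E}}$ the frame-synchronous label sequences of the Mandarin and English monolingual sub-tasks. The first line is the chain rule, which always holds; the second line is the assumed conditional independence, under which the bilingual output depends on the audio only through the monolingual information.

```latex
% Illustrative sketch only; all symbols are assumed, not the paper's notation.
\begin{align}
  p(Y, Z^{\mathrm{M}}, Z^{\mathrm{E}} \mid X)
    &= p(Y \mid Z^{\mathrm{M}}, Z^{\mathrm{E}}, X)\,
       p(Z^{\mathrm{M}}, Z^{\mathrm{E}} \mid X) \\
    &\approx p(Y \mid Z^{\mathrm{M}}, Z^{\mathrm{E}})\,
       p(Z^{\mathrm{M}}, Z^{\mathrm{E}} \mid X)
\end{align}
```

Under this reading, each factor could be parameterized by a neural module and trained jointly, which is consistent with the abstract's claim that the framework can be modeled by an end-to-end differentiable network.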