VarArray meets t-SOT: Advancing the state of the art of streaming distant conversational speech recognition

N Kanda, J Wu, X Wang, Z Chen, J Li… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
This paper presents a novel streaming automatic speech recognition (ASR) framework for
multi-talker overlapping speech captured by a distant microphone array with an arbitrary …

Summary on the ICASSP 2022 multi-channel multi-party meeting transcription grand challenge

F Yu, S Zhang, P Guo, Y Fu, Z Du… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
The ICASSP 2022 Multi-channel Multi-party Meeting Transcription Grand Challenge
(M2MeT) focuses on one of the most valuable and the most challenging scenarios of speech …

MFCCA: Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario

F Yu, S Zhang, P Guo, Y Liang, Z Du… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
Recently, cross-channel attention, which better leverages multi-channel signals from a
microphone array, has shown promising results in the multi-party meeting scenario. Cross …

SpatialEmb: Extract and Encode Spatial Information for 1-Stage Multi-Channel Multi-Speaker ASR on Arbitrary Microphone Arrays

Y Shao, Y Xu, S Khudanpur… - 2024 IEEE Spoken …, 2024 - ieeexplore.ieee.org
Spatial information is a critical clue for multi-channel multi-speaker target speech recognition.
Most state-of-the-art multi-channel Automatic Speech Recognition (ASR) systems extract …

Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC

J Kang, L Meng, M Cui, Y Wang, X Wu, X Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-talker speech recognition (MTASR) faces unique challenges in disentangling and
transcribing overlapping speech. To address these challenges, this paper investigates the …