Speech separation algorithms are often used to separate the target speech from other interfering sources. However, purely neural network based speech separation systems often …
R Gu, SX Zhang, Y Zou, D Yu - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org
Recently, frequency domain all-neural beamforming methods have achieved remarkable progress for multichannel speech separation. In parallel, the integration of time domain …
ZQ Wang, S Watanabe - Advances in Neural Information …, 2024 - proceedings.neurips.cc
In reverberant conditions with multiple concurrent speakers, each microphone acquires a mixture signal of multiple speakers at a different location. In over-determined conditions …
R Gu, SX Zhang, Y Zou, D Yu - IEEE Signal Processing Letters, 2021 - ieeexplore.ieee.org
To date, mainstream target speech separation (TSS) approaches are formulated to estimate the complex ratio mask (cRM) of the target speech in the time-frequency domain under supervised …
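The complex ratio mask mentioned above is the complex-valued ratio of the target's STFT to the mixture's STFT, applied multiplicatively to recover the target. A minimal numpy sketch of that idea (variable names such as `mixture_stft` and `target_stft` are illustrative, not taken from any specific paper):

```python
import numpy as np

# Hedged sketch of the ideal complex ratio mask (cRM), assuming the STFTs
# are available as complex arrays of shape (freq_bins, frames).
rng = np.random.default_rng(0)
freq_bins, frames = 4, 5
target_stft = rng.standard_normal((freq_bins, frames)) + 1j * rng.standard_normal((freq_bins, frames))
noise_stft = rng.standard_normal((freq_bins, frames)) + 1j * rng.standard_normal((freq_bins, frames))
mixture_stft = target_stft + noise_stft

# Ideal cRM: complex ratio of target to mixture; a small epsilon guards
# against division by near-zero mixture bins. A neural TSS model would
# estimate this mask rather than compute it from the (unknown) target.
eps = 1e-8
crm = target_stft / (mixture_stft + eps)

# Applying the mask multiplicatively to the mixture recovers the target.
estimate = crm * mixture_stft
```

In a supervised system, the network predicts `crm` from the mixture (and is trained against the ideal mask or the reconstructed signal); the division above only defines the training target.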
Accurate recognition of cocktail party speech containing overlapping speakers, noise and reverberation remains a highly challenging task to date. Motivated by the invariance of …
Far-field multi-speaker automatic speech recognition (ASR) has drawn increasing attention in recent years. Most existing methods feature a signal processing frontend and an ASR …
Although the conventional mask-based minimum variance distortionless response (MVDR) beamformer can reduce non-linear distortion, the residual noise level of the MVDR separated …
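The MVDR beamformer referenced above minimizes output noise power subject to a distortionless constraint on the target direction, with per-frequency weights w = R_n^{-1} d / (d^H R_n^{-1} d). A minimal narrowband sketch, assuming a known steering vector and a noise covariance estimated from noise-only snapshots (all variable names are illustrative):

```python
import numpy as np

# Hedged sketch of a narrowband MVDR beamformer for a single frequency bin.
rng = np.random.default_rng(1)
mics = 4

# Steering vector toward the target (here a random unit-norm stand-in; in a
# mask-based system it is typically derived from a masked speech covariance).
d = rng.standard_normal(mics) + 1j * rng.standard_normal(mics)
d /= np.linalg.norm(d)

# Noise spatial covariance estimated from noise-only multichannel snapshots.
noise = rng.standard_normal((mics, 200)) + 1j * rng.standard_normal((mics, 200))
R_n = noise @ noise.conj().T / noise.shape[1]

# MVDR weights: w = R_n^{-1} d / (d^H R_n^{-1} d), via a linear solve
# rather than an explicit matrix inverse.
Rinv_d = np.linalg.solve(R_n, d)
w = Rinv_d / (d.conj() @ Rinv_d)

# The distortionless constraint w^H d = 1 means the target direction is
# passed without linear distortion; residual noise is what remains.
```

Because the spatial filter is linear, the separated target is largely free of the non-linear distortions that purely neural masking can introduce, at the cost of the residual noise the snippet above alludes to.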
Many purely neural network based speech separation approaches have been proposed to improve objective assessment scores, but they often introduce nonlinear distortions that are …
We present a single-stage causal waveform-to-waveform multichannel model that can separate moving sound sources based on their broad spatial locations in a dynamic …