Listen, Think and Listen Again: Capturing Top-down Auditory Attention for Speaker-independent Speech Separation

J Shi, J Xu, G Liu, B Xu - IJCAI, 2018 - ijcai.org
Recent deep learning methods have made significant progress in multi-talker mixed speech
separation. However, most existing models adopt a driftless strategy to separate all the …

Beam-guided TasNet: An iterative speech separation framework with multi-channel output

H Chen, Y Yi, D Feng, P Zhang - arXiv preprint arXiv:2102.02998, 2021 - arxiv.org
Time-domain audio separation network (TasNet) has achieved remarkable performance in
blind source separation (BSS). Classic multi-channel speech processing framework …

3D spatial features for multi-channel target speech separation

R Gu, SX Zhang, M Yu, D Yu - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org
The use of speaker's directional information for speech separation and speech recognition
has demonstrated the state-of-the-art performances on multi-talker scenarios. One major …

Mixture to Mixture: Leveraging Close-talk Mixtures as Weak-supervision for Speech Separation

ZQ Wang - arXiv preprint arXiv:2402.09313, 2024 - arxiv.org
We propose mixture to mixture (M2M) training, a weakly-supervised neural speech
separation algorithm that leverages close-talk mixtures as a weak supervision for training …

Gated residual networks with dilated convolutions for supervised speech separation

K Tan, J Chen, DL Wang - 2018 IEEE International Conference …, 2018 - ieeexplore.ieee.org
In supervised speech separation, deep neural networks (DNNs) are typically employed to
predict an ideal time-frequency (TF) mask in order to remove background interference …

A gender mixture detection approach to unsupervised single-channel speech separation based on deep neural networks

Y Wang, J Du, LR Dai, CH Lee - IEEE/ACM Transactions on …, 2017 - ieeexplore.ieee.org
We propose an unsupervised speech separation framework for mixtures of two unseen
speakers in a single-channel setting based on deep neural networks (DNNs). We rely on a …

Heterogeneous separation consistency training for adaptation of unsupervised speech separation

J Han, Y Long - EURASIP Journal on Audio, Speech, and Music …, 2023 - Springer
Recently, supervised speech separation has made great progress. However, limited by the
nature of supervised training, most existing separation methods require ground-truth …

A target speaker separation neural network with joint-training

W Yang, J Wang, H Li, N Xu, F Xiang… - 2021 Asia-Pacific …, 2021 - ieeexplore.ieee.org
Target speaker separation aims to separate a target speech from multiple interference
voices, which is promising for solving conventional difficulties in speech separation, such as …

MIMO-DBnet: Multi-channel input and multiple outputs DOA-aware beamforming network for speech separation

Y Fu, H Yin, M Ge, L Wang, G Zhang, J Dang… - arXiv preprint arXiv …, 2022 - arxiv.org
Recently, many deep learning based beamformers have been proposed for multi-channel
speech separation. Nevertheless, most of them rely on extra cues known in advance, such …

Multi-Stream Gated and Pyramidal Temporal Convolutional Neural Networks for Audio-Visual Speech Separation in Multi-Talker Environments

Y Luo, J Wang, L Xu, L Yang - Interspeech, 2021 - researchgate.net
Speech separation is the task of extracting target speech from noisy mixture. In applications
like video telephones or video conferencing, lip movements of the target speaker are …