Semi-supervised multichannel speech separation based on a phone- and speaker-aware deep generative model of speech spectrograms

Y Du, K Sekiguchi, Y Bando… - 2020 28th European …, 2021 - ieeexplore.ieee.org
This paper describes a semi-supervised multichannel speech separation method that uses
clean speech signals with frame-wise phonetic labels and sample-level speaker labels for …

End-to-end networks for supervised single-channel speech separation

S Venkataramani, P Smaragdis - arXiv preprint arXiv:1810.02568, 2018 - arxiv.org
The performance of single channel source separation algorithms has improved greatly in
recent times with the development and deployment of neural networks. However, many such …

Librimix: An open-source dataset for generalizable speech separation

J Cosentino, M Pariente, S Cornell, A Deleforge… - arXiv preprint arXiv …, 2020 - arxiv.org
In recent years, wsj0-2mix has become the reference dataset for single-channel speech
separation. Most deep learning-based speech separation models today are benchmarked …

End-to-end post-filter for speech separation with deep attention fusion features

C Fan, J Tao, B Liu, J Yi, Z Wen… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
In this article, we propose an end-to-end post-filter method with deep attention fusion
features for monaural speaker-independent speech separation. At first, a time-frequency …

Single channel speech separation using enhanced learning on embedding features

HM Tan, JC Wang - 2021 IEEE 10th Global Conference on …, 2021 - ieeexplore.ieee.org
Speech separation has been utilized in many important applications such as automatic
speech, mobile phones, hearing aids, and human-machine interactions. In particular, deep …

Monaural speech separation using speaker embedding from preliminary separation

J Byun, JW Shin - IEEE/ACM Transactions on Audio, Speech …, 2021 - ieeexplore.ieee.org
In speech separation, the identities of the speakers may be an important cue to discriminate
speeches in the mixture and separate them better. A few recent researches used the …

Multi-resolution stacking for speech separation based on boosted DNN

XL Zhang, DL Wang - Sixteenth Annual Conference of the …, 2015 - xiaolei-zhang.net
Recent progress in speech separation shows that deep neural networks (DNN) based
supervised methods can improve the performance in difficult noise conditions and exhibit …

Don't shoot butterfly with rifles: Multi-channel continuous speech separation with early exit transformer

S Chen, Y Wu, Z Chen, T Yoshioka… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
With its strong modeling capacity that comes from a multi-head and multi-layer structure,
Transformer is a very powerful model for learning a sequential representation and has been …

Multi-microphone complex spectral mapping for utterance-wise and continuous speech separation

ZQ Wang, P Wang, DL Wang - IEEE/ACM transactions on …, 2021 - ieeexplore.ieee.org
We propose multi-microphone complex spectral mapping, a simple way of applying deep
learning for time-varying non-linear beamforming, for speaker separation in reverberant …

End-to-end speech separation with unfolded iterative phase reconstruction

ZQ Wang, JL Roux, DL Wang, JR Hershey - arXiv preprint arXiv …, 2018 - arxiv.org
This paper proposes an end-to-end approach for single-channel speaker-independent multi-
speaker speech separation, where time-frequency (TF) masking, the short-time Fourier …