Attention is all you need in speech separation

C Subakan, M Ravanelli, S Cornell… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Recurrent Neural Networks (RNNs) have long been the dominant architecture in sequence-
to-sequence learning. RNNs, however, are inherently sequential models that do not allow …

[HTML][HTML] Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Two heads are better than one: A two-stage complex spectral mapping approach for monaural speech enhancement

A Li, W Liu, C Zheng, C Fan, X Li - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org
For challenging acoustic scenarios such as low signal-to-noise ratios, current speech
enhancement systems usually suffer from a performance bottleneck in extracting the target …

FullSubNet+: Channel attention FullSubNet with complex spectrograms for speech enhancement

J Chen, Z Wang, D Tuo, Z Wu, S Kang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
The previously proposed FullSubNet has achieved outstanding performance in the Deep Noise
Suppression (DNS) Challenge and attracted much attention. However, it still encounters …

Sudo rm -rf: Efficient networks for universal audio source separation

E Tzinis, Z Wang, P Smaragdis - 2020 IEEE 30th International …, 2020 - ieeexplore.ieee.org
In this paper, we present an efficient neural network for end-to-end general-purpose audio
source separation. Specifically, the backbone structure of this convolutional network is the …

Asteroid: the PyTorch-based audio source separation toolkit for researchers

M Pariente, S Cornell, J Cosentino… - arXiv preprint arXiv …, 2020 - arxiv.org
This paper describes Asteroid, the PyTorch-based audio source separation toolkit for
researchers. Inspired by the most successful neural source separation systems, it provides …

What's all the fuss about free universal sound separation data?

S Wisdom, H Erdogan, DPW Ellis… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
We introduce the Free Universal Sound Separation (FUSS) dataset, a new corpus for
experiments in separating mixtures of an unknown number of sounds from an open domain …

RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

E Tzinis, Y Adi, VK Ithapu, B Xu… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
We present RemixIT, a simple yet effective self-supervised method for training speech
enhancement models without the need for a single isolated in-domain speech or noise waveform …

Speech separation using an asynchronous fully recurrent convolutional neural network

X Hu, K Li, W Zhang, Y Luo… - Advances in …, 2021 - proceedings.neurips.cc
Recent advances in the design of neural network architectures, in particular those
specialized in modeling sequences, have provided significant improvements in speech …

AudioScopeV2: Audio-visual attention architectures for calibrated open-domain on-screen sound separation

E Tzinis, S Wisdom, T Remez, JR Hershey - European Conference on …, 2022 - Springer
We introduce AudioScopeV2, a state-of-the-art universal audio-visual on-screen sound
separation system which is capable of learning to separate sounds and associate them with …