Supervised speech separation based on deep learning: An overview

DL Wang, J Chen - IEEE/ACM transactions on audio, speech …, 2018 - ieeexplore.ieee.org
Speech separation is the task of separating target speech from background interference.
Traditionally, speech separation is studied as a signal processing problem. A more recent …
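The snippet frames separation as a supervised learning problem; a common training target in that literature is a time-frequency mask. Below is a minimal NumPy/SciPy sketch of computing an ideal ratio mask (IRM) from parallel speech and noise signals and applying it to the mixture. The signal names and STFT parameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.signal import stft, istft

def ideal_ratio_mask(speech, noise, fs=16000, nperseg=512):
    """Compute an ideal ratio mask (IRM) from separate speech and noise signals
    and apply it to the mixture spectrogram to recover a speech estimate.
    In the supervised setting, a network is trained to predict this mask
    from the mixture alone."""
    _, _, S = stft(speech, fs=fs, nperseg=nperseg)            # clean speech STFT
    _, _, N = stft(noise, fs=fs, nperseg=nperseg)             # noise STFT
    _, _, Y = stft(speech + noise, fs=fs, nperseg=nperseg)    # mixture STFT

    # IRM: speech energy divided by total energy in each time-frequency bin
    irm = np.abs(S) ** 2 / (np.abs(S) ** 2 + np.abs(N) ** 2 + 1e-12)

    # Apply the mask to the mixture and reconstruct a waveform estimate
    _, speech_est = istft(irm * Y, fs=fs, nperseg=nperseg)
    return irm, speech_est
```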

A review of blind source separation methods: two converging routes to ILRMA originating from ICA and NMF

H Sawada, N Ono, H Kameoka, D Kitamura… - … Transactions on Signal …, 2019 - cambridge.org
This paper describes several important methods for the blind source separation of audio
signals in an integrated manner. Two historically developed routes are featured. One started …
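One of the two routes the paper traces (NMF) can be illustrated with the classic multiplicative-update rules on a magnitude spectrogram. The NumPy sketch below minimizes the Euclidean cost and is a generic textbook NMF, not the ILRMA algorithm itself; rank and iteration count are arbitrary assumptions.

```python
import numpy as np

def nmf(V, rank=8, n_iter=200, eps=1e-12):
    """Factor a non-negative magnitude spectrogram V (freq x time) as V ~ W @ H
    using Lee-Seung multiplicative updates for the Euclidean cost."""
    rng = np.random.default_rng(0)
    F, T = V.shape
    W = rng.random((F, rank)) + eps   # spectral basis vectors
    H = rng.random((rank, T)) + eps   # temporal activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update bases
    return W, H
```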

A survey of sound source localization with deep learning methods

PA Grumiaux, S Kitić, L Girin, A Guérin - The Journal of the Acoustical …, 2022 - pubs.aip.org
This article is a survey of deep learning methods for single and multiple sound source
localization, with a focus on sound source localization in indoor environments, where …
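Many of the networks surveyed take classical spatial features as input; a common choice is the generalized cross-correlation with phase transform (GCC-PHAT) between a microphone pair. A short NumPy sketch follows; the sampling rate and function name are assumptions for illustration.

```python
import numpy as np

def gcc_phat(x1, x2, fs=16000, max_tau=None):
    """Estimate the time difference of arrival between two microphone signals
    using GCC-PHAT (cross-power spectrum normalized to unit magnitude)."""
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)
    cc = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n=n)
    max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
    # Re-centre so that lag 0 sits in the middle of the correlation vector
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    tau = (np.argmax(np.abs(cc)) - max_shift) / fs   # estimated delay in seconds
    return tau, cc
```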

Deep learning for audio signal processing

H Purwins, B Li, T Virtanen, J Schlüter… - IEEE Journal of …, 2019 - ieeexplore.ieee.org
Given the recent surge in developments of deep learning, this paper provides a review of the
state-of-the-art deep learning techniques for audio signal processing. Speech, music, and …
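A recurring point in this review is the choice of input representation; log-mel spectrograms are a widely used front-end across speech, music, and environmental sound tasks. A brief sketch using librosa is given below; the parameter values are illustrative assumptions.

```python
import librosa
import numpy as np

def log_mel(path, sr=16000, n_mels=64, n_fft=1024, hop_length=256):
    """Load an audio file and compute a log-scaled mel spectrogram,
    a typical input feature for CNN/RNN audio models."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)   # shape: (n_mels, n_frames)
```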

Machine learning in acoustics: Theory and applications

MJ Bianco, P Gerstoft, J Traer, E Ozanich… - The Journal of the …, 2019 - pubs.aip.org
Acoustic data provide scientific and engineering insights in fields ranging from biology and
communications to ocean and Earth science. We survey the recent advances and …

Wave-u-net: A multi-scale neural network for end-to-end audio source separation

D Stoller, S Ewert, S Dixon - arXiv preprint arXiv:1806.03185, 2018 - arxiv.org
Models for audio source separation usually operate on the magnitude spectrum, which
ignores phase information and makes separation performance dependent on hyper …
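The snippet describes separating directly on the raw waveform with a multi-scale architecture. Below is a heavily simplified PyTorch sketch of that idea (1-D convolutions with downsampling, upsampling, and skip connections between matching scales); the layer sizes are assumptions and this is not the authors' exact model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyWaveUNet(nn.Module):
    """A much-reduced Wave-U-Net-style model: 1-D convolutions on the raw
    waveform, decimation/interpolation by 2, and skip connections."""
    def __init__(self, channels=24):
        super().__init__()
        self.down1 = nn.Conv1d(1, channels, kernel_size=15, padding=7)
        self.down2 = nn.Conv1d(channels, channels * 2, kernel_size=15, padding=7)
        self.bottleneck = nn.Conv1d(channels * 2, channels * 2, kernel_size=15, padding=7)
        self.up2 = nn.Conv1d(channels * 4, channels, kernel_size=5, padding=2)
        self.up1 = nn.Conv1d(channels * 2, 1, kernel_size=5, padding=2)

    def forward(self, x):                                # x: (batch, 1, samples), samples divisible by 4
        d1 = torch.relu(self.down1(x))
        d2 = torch.relu(self.down2(d1[:, :, ::2]))       # decimate by 2
        b = torch.relu(self.bottleneck(d2[:, :, ::2]))   # decimate by 2 again
        u2 = F.interpolate(b, scale_factor=2, mode='linear')
        u2 = torch.relu(self.up2(torch.cat([u2, d2], dim=1)))   # skip connection
        u1 = F.interpolate(u2, scale_factor=2, mode='linear')
        return self.up1(torch.cat([u1, d1], dim=1))      # estimated source waveform
```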

A consolidated perspective on multimicrophone speech enhancement and source separation

S Gannot, E Vincent… - … /ACM Transactions on …, 2017 - ieeexplore.ieee.org
Speech enhancement and separation are core problems in audio signal processing, with
commercial applications in devices as diverse as mobile phones, conference call systems …
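The simplest member of the multimicrophone family covered in this overview is the delay-and-sum beamformer. A short frequency-domain NumPy sketch follows; the array geometry, steering delays, and variable names are illustrative assumptions.

```python
import numpy as np

def delay_and_sum(X, mic_delays, freqs):
    """Frequency-domain delay-and-sum beamformer.
    X:          STFT of the array signals, shape (mics, freq_bins, frames)
    mic_delays: per-microphone propagation delays toward the target, in seconds
    freqs:      centre frequency of each bin, in Hz
    Returns the beamformed STFT of shape (freq_bins, frames)."""
    mics = X.shape[0]
    # Steering vector: the modelled propagation phase toward the target
    steer = np.exp(-2j * np.pi * np.outer(mic_delays, freqs))   # (mics, freq_bins)
    # w^H X with w = steer / mics: aligns the target across microphones and averages
    return np.einsum('mf,mft->ft', steer.conj(), X) / mics
```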

An analysis of environment, microphone and data simulation mismatches in robust speech recognition

E Vincent, S Watanabe, AA Nugraha, J Barker… - Computer Speech & …, 2017 - Elsevier
Speech enhancement and automatic speech recognition (ASR) are most often evaluated in
matched (or multi-condition) settings where the acoustic conditions of the training data …
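The data-simulation mismatch studied here stems from how noisy training mixtures are generated. A minimal sketch of one common recipe (convolve clean speech with a room impulse response, then add background noise at a chosen SNR) is shown below; the SNR definition and variable names are assumptions, not the paper's exact protocol.

```python
import numpy as np
from scipy.signal import fftconvolve

def simulate_mixture(speech, rir, noise, snr_db):
    """Create a simulated noisy/reverberant utterance: convolve clean speech
    with a room impulse response, then add noise scaled to the requested SNR.
    Assumes the noise signal is at least as long as the speech signal."""
    reverberant = fftconvolve(speech, rir)[:len(speech)]
    noise = noise[:len(reverberant)]
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    # Choose gain so that 10*log10(speech_power / (gain**2 * noise_power)) == snr_db
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return reverberant + gain * noise
```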

Self-supervised moving vehicle tracking with stereo sound

C Gan, H Zhao, P Chen, D Cox… - Proceedings of the …, 2019 - openaccess.thecvf.com
Humans are able to localize objects in the environment using both visual and auditory cues,
integrating information from multiple modalities into a common reference frame. We …

Improved speech enhancement with the wave-u-net

C Macartney, T Weyde - arXiv preprint arXiv:1811.11307, 2018 - arxiv.org
We study the use of the Wave-U-Net architecture for speech enhancement, a model
introduced by Stoller et al. for the separation of music vocals and accompaniment. This end …