Supervised speech separation based on deep learning: An overview

DL Wang, J Chen - IEEE/ACM transactions on audio, speech …, 2018 - ieeexplore.ieee.org
Speech separation is the task of separating target speech from background interference.
Traditionally, speech separation is studied as a signal processing problem. A more recent …

An overview of noise-robust automatic speech recognition

J Li, L Deng, Y Gong… - IEEE/ACM Transactions …, 2014 - ieeexplore.ieee.org
New waves of consumer-centric applications, such as voice search and voice interaction
with mobile devices and home entertainment systems, increasingly require automatic …

DDSP: Differentiable digital signal processing

J Engel, L Hantrakul, C Gu, A Roberts - arXiv preprint arXiv:2001.04643, 2020 - arxiv.org
Most generative models of audio directly generate samples in one of two domains: time or
frequency. While sufficient to express any signal, these representations are inefficient, as …

[HTML][HTML] Machine learning in acoustics: Theory and applications

MJ Bianco, P Gerstoft, J Traer, E Ozanich… - The Journal of the …, 2019 - pubs.aip.org
Acoustic data provide scientific and engineering insights in fields ranging from biology and
communications to ocean and Earth science. We survey the recent advances and …

A consolidated perspective on multimicrophone speech enhancement and source separation

S Gannot, E Vincent… - … /ACM Transactions on …, 2017 - ieeexplore.ieee.org
Speech enhancement and separation are core problems in audio signal processing, with
commercial applications in devices as diverse as mobile phones, conference call systems …

A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research

K Kinoshita, M Delcroix, S Gannot, EA P. Habets… - EURASIP Journal on …, 2016 - Springer
In recent years, substantial progress has been made in the field of reverberant speech
signal processing, including both single-and multichannel dereverberation techniques and …

HiFi-GAN: High-fidelity denoising and dereverberation based on speech deep features in adversarial networks

J Su, Z Jin, A Finkelstein - arXiv preprint arXiv:2006.05694, 2020 - arxiv.org
Real-world audio recordings are often degraded by factors such as noise, reverberation,
and equalization distortion. This paper introduces HiFi-GAN, a deep learning method to …

Storm: A diffusion-based stochastic regeneration model for speech enhancement and dereverberation

JM Lemercier, J Richter, S Welker… - … /ACM Transactions on …, 2023 - ieeexplore.ieee.org
Diffusion models have shown a great ability at bridging the performance gap between
predictive and generative approaches for speech enhancement. We have shown that they …

Far-field automatic speech recognition

R Haeb-Umbach, J Heymann, L Drude… - Proceedings of the …, 2020 - ieeexplore.ieee.org
The machine recognition of speech spoken at a distance from the microphones, known as
far-field automatic speech recognition (ASR), has received a significant increase in attention …

The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech

K Kinoshita, M Delcroix, T Yoshioka… - … IEEE Workshop on …, 2013 - ieeexplore.ieee.org
Recently, substantial progress has been made in the field of reverberant speech signal
processing, including both single-and multichannel dereverberation techniques, and …