Deep learning for environmentally robust speech recognition: An overview of recent developments

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org
Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …

An overview of noise-robust automatic speech recognition

J Li, L Deng, Y Gong… - IEEE/ACM Transactions …, 2014 - ieeexplore.ieee.org
New waves of consumer-centric applications, such as voice search and voice interaction
with mobile devices and home entertainment systems, increasingly require automatic …

CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings

S Watanabe, M Mandel, J Barker, E Vincent… - arXiv preprint arXiv …, 2020 - arxiv.org
Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the
6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge …

The fifth 'CHiME' speech separation and recognition challenge: dataset, task and baselines

J Barker, S Watanabe, E Vincent, J Trmal - arXiv preprint arXiv …, 2018 - arxiv.org
The CHiME challenge series aims to advance robust automatic speech recognition (ASR)
technology by promoting research at the interface of speech and language processing …

A consolidated perspective on multimicrophone speech enhancement and source separation

S Gannot, E Vincent… - … /ACM Transactions on …, 2017 - ieeexplore.ieee.org
Speech enhancement and separation are core problems in audio signal processing, with
commercial applications in devices as diverse as mobile phones, conference call systems …

An analysis of environment, microphone and data simulation mismatches in robust speech recognition

E Vincent, S Watanabe, AA Nugraha, J Barker… - Computer Speech & …, 2017 - Elsevier
Speech enhancement and automatic speech recognition (ASR) are most often evaluated in
matched (or multi-condition) settings where the acoustic conditions of the training data …

Ideal ratio mask estimation using deep neural networks for robust speech recognition

A Narayanan, DL Wang - 2013 IEEE international conference …, 2013 - ieeexplore.ieee.org
We propose a feature enhancement algorithm to improve robust automatic speech
recognition (ASR). The algorithm estimates a smoothed ideal ratio mask (IRM) in the Mel …

Multichannel audio source separation with deep neural networks

AA Nugraha, A Liutkus, E Vincent - IEEE/ACM Transactions on …, 2016 - ieeexplore.ieee.org
This article addresses the problem of multichannel audio source separation. We propose a
framework where deep neural networks (DNNs) are used to model the source spectra and …

The first multimodal information based speech processing (MISP) challenge: Data, tasks, baselines and results

H Chen, H Zhou, J Du, CH Lee, J Chen… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In this paper we discuss the rationale of the Multimodal Information based Speech
Processing (MISP) Challenge, and provide a detailed description of the data recorded, the …

An unsupervised deep domain adaptation approach for robust speech recognition

S Sun, B Zhang, L Xie, Y Zhang - Neurocomputing, 2017 - Elsevier
This paper addresses the robust speech recognition problem as a domain adaptation task.
Specifically, we introduce an unsupervised deep domain adaptation (DDA) approach to …