A consolidated perspective on multimicrophone speech enhancement and source separation

S Gannot, E Vincent… - … /ACM Transactions on …, 2017 - ieeexplore.ieee.org
Speech enhancement and separation are core problems in audio signal processing, with
commercial applications in devices as diverse as mobile phones, conference call systems …

SALSA: A novel dataset for multimodal group behavior analysis

X Alameda-Pineda, J Staiano… - IEEE transactions on …, 2015 - ieeexplore.ieee.org
Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g.,
cocktail party) is gratifying due to the wealth of information available at the group (mining …

A variational EM algorithm for the separation of time-varying convolutive audio mixtures

D Kounades-Bastian, L Girin… - … on Audio, Speech …, 2016 - ieeexplore.ieee.org
This paper addresses the problem of separating audio sources from time-varying
convolutive mixtures. We propose a probabilistic framework based on the local complex …

Notes on the use of variational autoencoders for speech and audio spectrogram modeling

L Girin, F Roche, T Hueber, S Leglaive - DAFx 2019-22nd International …, 2019 - hal.science
Variational autoencoders (VAEs) are powerful (deep) generative artificial neural networks.
They have been recently used in several papers for speech and audio processing, in …

An expectation-maximization algorithm for multimicrophone speech dereverberation and noise reduction with coherence matrix estimation

O Schwartz, S Gannot… - IEEE/ACM Transactions on …, 2016 - ieeexplore.ieee.org
In speech communication systems, the microphone signals are degraded by reverberation
and ambient noise. The reverberant speech can be separated into two components, namely …

Speech enhancement based on Bayesian low-rank and sparse decomposition of multichannel magnitude spectrograms

Y Bando, K Itoyama, M Konyo… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org
This paper presents a blind multichannel speech enhancement method that can deal with
the time-varying layout of microphones and sound sources. Since nonnegative tensor …

Two model-based EM algorithms for blind source separation in noisy environments

B Schwartz, S Gannot… - IEEE/ACM Transactions on …, 2017 - ieeexplore.ieee.org
The problem of blind separation of speech signals in the presence of noise using multiple
microphones is addressed. Blind estimation of the acoustic parameters and the individual …

Low latency and high quality two-stage human-voice-enhancement system for a hose-shaped rescue robot

Y Bando, H Saruwatari, N Ono, S Makino… - Journal of Robotics …, 2017 - jstage.jst.go.jp
This paper presents the design and implementation of a two-stage human-voice
enhancement system for a hose-shaped rescue robot. When a microphone-equipped hose …

Audio source separation into the wild

L Girin, S Gannot, X Li - Multimodal Behavior Analysis in the Wild, 2019 - Elsevier
This review chapter is dedicated to multichannel audio source separation in a real-life
environment. We explore some of the major achievements in the field and discuss some of …

An inverse-gamma source variance prior with factorized parameterization for audio source separation

D Kounades-Bastian, L Girin… - … , Speech and Signal …, 2016 - ieeexplore.ieee.org
In this paper we present a new statistical model for the power spectral density (PSD) of an
audio signal and its application to multichannel audio source separation (MASS). The …