A recurrent variational autoencoder for speech enhancement

S Leglaive, X Alameda-Pineda, L Girin… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
This paper presents a generative approach to speech enhancement based on a recurrent
variational autoencoder (RVAE). The deep generative speech model is trained using clean …

Unsupervised speech enhancement using dynamical variational autoencoders

X Bie, S Leglaive, X Alameda-Pineda… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Dynamical variational autoencoders (DVAEs) are a class of deep generative models with
latent variables, dedicated to model time series of high-dimensional data. DVAEs can be …

A generic formula and some special cases for the Kullback–Leibler divergence between central multivariate Cauchy distributions

N Bouhlel, D Rousseau - Entropy, 2022 - mdpi.com
This paper introduces a closed-form expression for the Kullback–Leibler divergence (KLD)
between two central multivariate Cauchy distributions (MCDs) which have been recently …

Generalized Fast Multichannel Nonnegative Matrix Factorization Based on Gaussian Scale Mixtures for Blind Source Separation

M Fontaine, K Sekiguchi, AA Nugraha… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
This paper describes heavy-tailed extensions of a state-of-the-art versatile blind source
separation method called fast multichannel nonnegative matrix factorization (FastMNMF) …

Alpha-Stable Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Speech Enhancement and Dereverberation

M Fontaine, K Sekiguchi, AA Nugraha, Y Bando… - …, 2021 - sap.ist.i.kyoto-u.ac.jp
This paper proposes α-stable autoregressive fast multichannel nonnegative matrix
factorization (α-AR-FastMNMF), a robust joint blind speech enhancement and …

Unsupervised Robust Speech Enhancement Based on Alpha-Stable Fast Multichannel Nonnegative Matrix Factorization

M Fontaine, K Sekiguchi, AA Nugraha… - …, 2020 - sap.ist.i.kyoto-u.ac.jp
This paper describes multichannel speech enhancement based on a probabilistic model of
complex source spectrograms for improving the intelligibility of speech corrupted by …