Semi-Supervised Multichannel Speech Enhancement With a Deep Speech Prior

Authors

Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Kazuyoshi Yoshii, Tatsuya Kawahara

Publication date

2019/10/7

Journal

IEEE/ACM Transactions on Audio, Speech, and Language Processing

Volume

Issue

Pages

2197-2212

Publisher

IEEE

Description

This paper describes a semi-supervised multichannel speech enhancement method that uses clean speech data for prior training. Although multichannel nonnegative matrix factorization (MNMF) and its constrained variant called independent low-rank matrix analysis (ILRMA) have successfully been used for unsupervised speech enhancement, the low-rank assumption on the power spectral densities (PSDs) of all sources (speech and noise) does not hold in reality. To solve this problem, we replace a low-rank speech model with a deep generative speech model, i.e., formulate a probabilistic model of noisy speech by integrating a deep speech model, a low-rank noise model, and a full-rank or rank-1 model of spatial characteristics of speech and noise. The deep speech model is trained from clean speech data in an unsupervised auto-encoding variational Bayesian manner. Given multichannel noisy speech …
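The model structure named in the description (a deep speech model, a low-rank noise model, and full-rank spatial models) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration: the array sizes, the `toy_decoder` stand-in for a trained decoder, the random spatial covariance matrices, and the final Wiener-filter step, which is a generic way to use such covariances and is not taken from the truncated description. It does not reproduce the paper's inference procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes: frequency bins, frames, microphones, noise bases, latent dim.
F, T, M, K, D = 6, 5, 3, 2, 4

def toy_decoder(z, weights):
    """Stand-in for a trained decoder: latents (D, T) -> positive PSDs (F, T)."""
    return np.exp(weights @ z)

def random_spd(n_freq, n_mic, rng):
    """Random Hermitian positive-definite matrices, shape (n_freq, n_mic, n_mic)."""
    a = rng.standard_normal((n_freq, n_mic, n_mic)) \
        + 1j * rng.standard_normal((n_freq, n_mic, n_mic))
    return a @ a.conj().transpose(0, 2, 1) + 1e-3 * np.eye(n_mic)

# Speech PSD from the (toy) deep prior; noise PSD from a low-rank product.
z = rng.standard_normal((D, T))
speech_psd = toy_decoder(z, 0.1 * rng.standard_normal((F, D)))  # (F, T)
noise_psd = rng.random((F, K)) @ rng.random((K, T))              # (F, T), rank <= K

# Full-rank spatial covariance matrices, one per source and frequency bin.
G_speech = random_spd(F, M, rng)
G_noise = random_spd(F, M, rng)

# Covariance of each source image and of the mixture at every time-frequency bin.
Y_speech = speech_psd[:, :, None, None] * G_speech[:, None]      # (F, T, M, M)
Y_noise = noise_psd[:, :, None, None] * G_noise[:, None]
Y_mix = Y_speech + Y_noise

# Assumed enhancement step: multichannel Wiener filter applied to a toy observation.
x = rng.standard_normal((F, T, M)) + 1j * rng.standard_normal((F, T, M))
W = Y_speech @ np.linalg.inv(Y_mix)
speech_est = np.einsum("ftmn,ftn->ftm", W, x)
```

The only structural difference between the two sources in this sketch is how the PSD is produced: the noise PSD is constrained to rank at most `K`, while the speech PSD comes from a nonlinear map of a latent variable and has no such constraint.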

Total citations

Cited by 47

Citations per year: 2020: 10, 2021: 11, 2022: 12, 2023: 9, 2024: 5
