Authors
Tomasz Grzywalski, Szymon Drgas
Publication date
2019/5/12
Conference
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Pages
6970-6974
Publisher
IEEE
Description
When designing a fully convolutional neural network, there is a trade-off between receptive field size, number of parameters, and spatial resolution of features in the deeper layers of the network. In this work we present a novel network design, based on a combination of many convolutional and recurrent layers, that resolves this trade-off. We compare our solution with U-Net-based models known from the literature and with other baseline models on a speech enhancement task. We test our solution on TIMIT speech utterances mixed with noise segments extracted from the NOISEX-92 database and show a clear advantage of the proposed solution in terms of SDR (signal-to-distortion ratio), SIR (signal-to-interference ratio), and STOI (short-time objective intelligibility) compared with the current state of the art.
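The trade-off named in the description can be made concrete with a little arithmetic. The sketch below is not the authors' architecture; it only compares three hypothetical four-layer convolutional stacks (plain, strided, dilated) using the standard receptive-field recurrence. The helper names `receptive_field` and `conv2d_params`, and the choice of 64 channels, are assumptions made for illustration.

```python
def receptive_field(layers):
    """Receptive field and cumulative stride along one axis.

    layers: list of (kernel_size, stride, dilation) tuples, one per conv layer.
    Returns (receptive_field, downsampling_factor).
    """
    rf, jump = 1, 1  # jump = spacing of output samples measured in input samples
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf, jump


def conv2d_params(kernel, c_in, c_out):
    """Weights plus biases of a square 2-D convolution."""
    return kernel * kernel * c_in * c_out + c_out


# Three hypothetical stacks of four 3x3 layers (illustrative only).
plain = [(3, 1, 1)] * 4                       # no striding, no dilation
strided = [(3, 2, 1)] * 4                     # stride 2 in every layer
dilated = [(3, 1, 2 ** i) for i in range(4)]  # dilation 1, 2, 4, 8

for name, stack in [("plain", plain), ("strided", strided), ("dilated", dilated)]:
    rf, down = receptive_field(stack)
    params = len(stack) * conv2d_params(3, 64, 64)
    print(f"{name:8s} receptive field={rf:3d} downsampling={down:2d}x params={params}")
```

All three stacks have the same parameter count, since stride and dilation add no weights. The plain stack sees only 9 input samples per axis, the strided stack reaches 31 but at 16x coarser resolution, and the dilated stack reaches 31 at full resolution. Enlarging the kernel instead would widen the receptive field at the cost of more parameters.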
Total citations
[Bar chart: citations per year. 2019: 1, 2020: 6, 2021: 6, 2022: 5, 2023: 4]