Using recurrences in time and frequency within U-net architecture for speech enhancement

Authors

Tomasz Grzywalski, Szymon Drgas

Publication date

2019/5/12

Conference

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pages

6970-6974

Publisher

IEEE

Description

When designing a fully convolutional neural network, there is a trade-off between receptive field size, number of parameters, and spatial resolution of features in the deeper layers of the network. In this work we present a novel network design, based on a combination of many convolutional and recurrent layers, that resolves this trade-off. We compare our solution with U-net-based models known from the literature and with other baseline models on a speech enhancement task. We test our solution on TIMIT speech utterances mixed with noise segments extracted from the NOISEX-92 database and show a clear advantage of the proposed solution in terms of SDR (signal-to-distortion ratio), SIR (signal-to-interference ratio), and STOI (short-time objective intelligibility) compared to the current state of the art.

Total citations

Cited by 22

Citations per year: 2019: 1, 2020: 6, 2021: 6, 2022: 5, 2023: 4
