Performance analysis of various training targets for improving speech quality and intelligibility

S Sivapatham, A Kar, R Ramadoss - Applied Acoustics, 2021 - Elsevier
Denoising single-channel speech (recorded using one microphone) remains an open
problem in many speech-related applications. Recently, supervised deep learning methods
have been used to denoise the speech signal. This work uses a Deep Neural Network (DNN) to learn
the Time–Frequency (TF) mask of the clean speech from noisy speech features. In
general, the Ideal Binary Mask (IBM) is used as the binary mask training target to improve
speech intelligibility, and the Ideal Ratio Mask (IRM) is used as a non-binary mask training …
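
To make the two training targets named in the abstract concrete, the following is a minimal sketch using the definitions commonly given in the speech-separation literature; the abstract shown here does not state the paper's exact formulas, so treat this as an illustration rather than the authors' method. The function names, the local SNR threshold `lc_db`, the exponent `beta`, and the small constant `EPS` are all assumptions made for this example. Inputs are per-TF-unit power values of clean speech and noise, given as nested lists (rows are frames, columns are frequency bins).

```python
import math

EPS = 1e-12  # assumed small constant to avoid log(0) and division by zero


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Binary mask: 1 where the local SNR in dB exceeds lc_db, else 0.

    lc_db is a hypothetical local criterion chosen for this sketch.
    """
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            snr_db = 10.0 * math.log10((s + EPS) / (n + EPS))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Ratio mask in [0, 1]: (S / (S + N)) ** beta per TF unit.

    beta = 0.5 is an assumed, commonly used value, not taken from the source.
    """
    return [
        [(s / (s + n + EPS)) ** beta for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


# Toy 2x2 example with made-up power values (not data from the paper).
speech = [[4.0, 1.0], [0.0, 9.0]]
noise = [[1.0, 4.0], [1.0, 1.0]]
ibm = ideal_binary_mask(speech, noise)
irm = ideal_ratio_mask(speech, noise)
```

In this sketch the binary mask makes a hard keep-or-discard decision per TF unit, while the ratio mask assigns a soft gain between 0 and 1, which is the distinction the abstract draws between the binary and non-binary training targets.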