Voicefixer: A unified framework for high-fidelity speech restoration

H Liu, X Liu, Q Kong, Q Tian, Y Zhao, DL Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus
on a single type of distortion, such as speech denoising or dereverberation. However …

Fusing bone-conduction and air-conduction sensors for complex-domain speech enhancement

H Wang, X Zhang, DL Wang - IEEE/ACM transactions on audio …, 2022 - ieeexplore.ieee.org
Speech enhancement aims to improve the listening quality and intelligibility of noisy speech
in adverse environments. It proves to be challenging to perform speech enhancement in …

F0 Estimation and Voicing Detection With Cascade Architecture in Noisy Speech

Y Zhang, H Wang, DL Wang - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org
As a fundamental problem in speech processing, pitch tracking has been studied for
decades. While strong performance has been achieved on clean speech, pitch tracking in …

Attention-based fusion for bone-conducted and air-conducted speech enhancement in the complex domain

H Wang, X Zhang, DL Wang - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Bone-conduction (BC) microphones capture speech signals by converting the vibrations of
the human skull into electrical signals. BC sensors are insensitive to acoustic noise, but …

Neural cascade architecture with triple-domain loss for speech enhancement

H Wang, DL Wang - IEEE/ACM transactions on audio, speech …, 2021 - ieeexplore.ieee.org
This paper proposes a neural cascade architecture to address the monaural speech
enhancement problem. The cascade architecture is composed of three modules which …

Phase continuity: Learning derivatives of phase spectrum for speech enhancement

D Kim, H Han, HK Shin, SW Chung… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Modern neural speech enhancement models usually include various forms of phase
information in their training loss terms, either explicitly or implicitly. However, these loss …

Multi-resolution Convolutional Residual Neural Networks for Monaural Speech Dereverberation

L Zhao, W Zhu, S Li, H Luo, XL Zhang… - … /ACM Transactions on …, 2024 - ieeexplore.ieee.org
It is known that the reverberant speech in different acoustic environments varies according to
reverberation time. However, most deep learning based speech dereverberation methods …

A systematic study of DNN based speech enhancement in reverberant and reverberant-noisy environments

H Wang, A Pandey, DL Wang - Computer Speech & Language, 2024 - Elsevier
Deep learning has led to dramatic performance improvements for the task of speech
enhancement, where deep neural networks (DNNs) are trained to recover clean speech …

Multichannel Linear Prediction-Based Speech Dereverberation Considering Sparse and Low-Rank Priors

T Wang, F Yang, J Yang - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org
This article addresses the multi-channel linear prediction (MCLP)-based speech
dereverberation problem by jointly considering the sparsity and low-rank priors of speech …

Mitigating Domain Dependency for Improved Speech Enhancement Via SNR Loss Boosting

L Yin, D Wu, Z Qiu, H Huang - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Current supervised speech enhancement methods based on deep learning typically utilize
amplitude-based loss functions for optimization, such as Mean Absolute Error (MAE) or …