A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Sixty years of frequency-domain monaural speech enhancement: From traditional to deep learning methods

C Zheng, H Zhang, W Liu, X Luo, A Li, X Li… - Trends in …, 2023 - journals.sagepub.com
Frequency-domain monaural speech enhancement has been extensively studied for over
60 years, and a great number of methods have been proposed and applied to many …

Universal speech enhancement with score-based diffusion

J Serrà, S Pascual, J Pons, RO Araz… - arXiv preprint arXiv …, 2022 - arxiv.org
Removing background noise from speech audio has been the subject of considerable effort,
especially in recent years due to the rise of virtual communication and amateur recordings …

DCCRN+: Channel-wise subband DCCRN with SNR estimation for speech enhancement

S Lv, Y Hu, S Zhang, L Xie - arXiv preprint arXiv:2106.08672, 2021 - arxiv.org
Deep complex convolution recurrent network (DCCRN), which extends CRN with complex
structure, has achieved superior performance in MOS evaluation in Interspeech 2020 deep …

Multi-scale temporal frequency convolutional network with axial attention for speech enhancement

G Zhang, L Yu, C Wang, J Wei - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Speech quality is often degraded by acoustic echoes, background noise, and reverberation.
In this paper, we propose a system consisting of deep learning and signal processing to …

MANNER: Multi-view attention network for noise erasure

HJ Park, BH Kang, W Shin, JS Kim… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In the field of speech enhancement, time domain methods have difficulties in achieving both
high performance and efficiency. Recently, dual-path models have been adopted to …

CMGAN: Conformer-based Metric-GAN for monaural speech enhancement

S Abdulatif, R Cao, B Yang - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org
In this work, we further develop the conformer-based metric generative adversarial network
(CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain. This …

A time-frequency attention module for neural speech enhancement

Q Zhang, X Qian, Z Ni, A Nicolson… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
Speech enhancement plays an essential role in a wide range of speech processing
applications. Recent studies on speech enhancement tend to investigate how to effectively …

A nested U-Net with self-attention and dense connectivity for monaural speech enhancement

X Xiang, X Zhang, H Chen - IEEE Signal Processing Letters, 2021 - ieeexplore.ieee.org
With the development of deep neural networks, speech enhancement technology has been
vastly improved. However, commonly used speech enhancement approaches cannot fully …

Interactive feature fusion for end-to-end noise-robust speech recognition

Y Hu, N Hou, C Chen, ES Chng - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Speech enhancement (SE) aims to suppress the additive noise from noisy speech signals to
improve the speech's perceptual quality and intelligibility. However, the over-suppression …