A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Icassp 2023 deep noise suppression challenge

H Dubey, A Aazami, V Gopal, B Naderi… - IEEE Open Journal …, 2024 - ieeexplore.ieee.org
The ICASSP 2023 Deep Noise Suppression (DNS) Challenge marks the fifth edition of the
DNS challenge series. DNS challenges were organized from 2019 to 2023 to foster …

DNSMOS: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors

CKA Reddy, V Gopal, R Cutler - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Human subjective evaluation is the" gold standard" to evaluate speech quality optimized for
human perception. Perceptual objective metrics serve as a proxy for subjective scores. The …

NISQA: A deep CNN-self-attention model for multidimensional speech quality prediction with crowdsourced datasets

G Mittag, B Naderi, A Chehadi, S Möller - arXiv preprint arXiv:2104.09494, 2021 - arxiv.org
In this paper, we present an update to the NISQA speech quality prediction model that is
focused on distortions that occur in communication networks. In contrast to the previous …

DNSMOS P. 835: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors

CKA Reddy, V Gopal, R Cutler - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Human subjective evaluation is the" gold standard" to evaluate speech quality optimized for
human perception. Perceptual objective metrics serve as a proxy for subjective scores. We …

Deep learning-based non-intrusive multi-objective speech assessment model with cross-domain features

RE Zezario, SW Fu, F Chen, CS Fuh… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
This study proposes a cross-domain multi-objective speech assessment model, called
MOSA-Net, which can simultaneously estimate the speech quality, intelligibility, and …

NORESQA: A framework for speech quality assessment using non-matching references

P Manocha, B Xu, A Kumar - Advances in neural …, 2021 - proceedings.neurips.cc
The perceptual task of speech quality assessment (SQA) is a challenging task for machines
to do. Objective SQA methods that rely on the availability of the corresponding clean …

Torchaudio-squim: Reference-less speech quality and intelligibility measures in torchaudio

A Kumar, K Tan, Z Ni, P Manocha… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Measuring quality and intelligibility of a speech signal is usually a critical step in
development of speech processing systems. To enable this, a variety of metrics to measure …

A study on incorporating Whisper for robust speech assessment

RE Zezario, YW Chen, SW Fu, Y Tsao… - … on Multimedia and …, 2024 - ieeexplore.ieee.org
This research introduces an enhanced version of the multi-objective speech assessment
model–MOSA-Net+, by leveraging the acoustic features from Whisper, a large-scaled …

Metricnet: Towards improved modeling for non-intrusive speech quality assessment

M Yu, C Zhang, Y Xu, S Zhang, D Yu - arXiv preprint arXiv:2104.01227, 2021 - arxiv.org
The objective speech quality assessment is usually conducted by comparing received
speech signal with its clean reference, while human beings are capable of evaluating the …