Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural networks (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

On loss functions for supervised monaural time-domain speech enhancement

M Kolbæk, ZH Tan, SH Jensen… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Many deep learning-based speech enhancement algorithms are designed to minimize the
mean-square error (MSE) in some transform domain between a predicted and a target …

A joint speech enhancement and self-supervised representation learning framework for noise-robust speech recognition

QS Zhu, J Zhang, ZQ Zhang… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
Though speech enhancement (SE) can be used to improve speech quality in noisy
environments, it may also cause distortions that degrade the performance of automatic …

Speech enhancement using long short term memory with trained speech features and adaptive wiener filter

A Garg - Multimedia tools and applications, 2023 - Springer
Speech enhancement is the process of enhancing the clarity and intelligibility of speech
signals that have been degraded due to background noise. With the assistance of deep …

Multi-scale decomposition based supervised single channel deep speech enhancement

N Saleem, MI Khattak - Applied Soft Computing, 2020 - Elsevier
Speech signals reaching our ears are in general contaminated by the background noise
distortion which is detrimental to both speech quality and intelligibility. In this paper, we …

Attention-based speech enhancement using human quality perception modelling

KM Nayem, DS Williamson - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
Perceptually-inspired objective functions such as the perceptual evaluation of speech
quality (PESQ), signal-to-distortion ratio (SDR), and short-time objective intelligibility (STOI) …

A Multiscale Autoencoder (MSAE) Framework for End-to-End Neural Network Speech Enhancement

BJ Borgström, MS Brandstein - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Neural network approaches to single-channel speech enhancement have received much
recent attention. In particular, mask-based architectures have achieved significant …

End-to-end speech intelligibility prediction using time-domain fully convolutional neural networks

M Pedersen, M Kolbæk, AH Andersen, SH Jensen… - Interspeech 2020, 2020 - vbn.aau.dk
Data-driven speech intelligibility prediction has been slow to take off. Datasets of measured
speech intelligibility are scarce, and so current models are relatively small and rely on hand …

An Analysis of Traditional Noise Power Spectral Density Estimators Based on the Gaussian Stochastic Volatility Model

JK Nielsen, MG Christensen… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
Many single- and multi-channel speech enhancement techniques, old and new, rely in one
way or another on estimates of the noise power spectral density (PSD). For example, the …

Optical Microphone-Based Speech Reconstruction System With Deep Learning for Individuals With Hearing Loss

YM Lin, JY Han, CH Lin, YH Lai - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Objective: Although many speech enhancement (SE) algorithms have been proposed to
promote speech perception in hearing-impaired patients, the conventional SE approaches …