Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

TF-GridNet: Integrating full- and sub-band modeling for speech separation

ZQ Wang, S Cornell, S Choi, Y Lee… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
We propose TF-GridNet for speech separation. The model is a novel deep neural network
(DNN) integrating full- and sub-band modeling in the time-frequency (TF) domain. It stacks …

A review on speech separation in cocktail party environment: challenges and approaches

J Agrawal, M Gupta, H Garg - Multimedia Tools and Applications, 2023 - Springer
The cocktail party problem, which is tracing and identifying a specific speaker's speech
while numerous speakers communicate concurrently, is one of the crucial problems still to be …

Monaural source separation: From anechoic to reverberant environments

T Cord-Landwehr, C Boeddeker… - … on acoustic signal …, 2022 - ieeexplore.ieee.org
Impressive progress in neural network-based single-channel speech source separation has
been made in recent years. But those improvements have been mostly reported on anechoic …

Unifying speech enhancement and separation with gradient modulation for end-to-end noise-robust speech separation

Y Hu, C Chen, H Zou, X Zhong… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Recent studies in neural network-based monaural speech separation (SS) have achieved
remarkable success thanks to the increasing ability of long-sequence modeling. However, they …

On data sampling strategies for training neural network speech separation models

W Ravenscroft, S Goetze, T Hain - 2023 31st European Signal …, 2023 - ieeexplore.ieee.org
Speech separation remains an important area of multi-speaker signal processing. Deep
neural network (DNN) models have attained the best performance on many speech …

Single-microphone speaker separation and voice activity detection in noisy and reverberant environments

R Opochinsky, M Moradi, S Gannot - arXiv preprint arXiv:2401.03448, 2024 - arxiv.org
Speech separation involves extracting an individual speaker's voice from a multi-speaker
audio signal. The increasing complexity of real-world environments, where multiple …

Audio denoising for robust audio fingerprinting

K Akesbi - arXiv preprint arXiv:2212.11277, 2022 - arxiv.org
Music discovery services let users identify songs from short mobile recordings. These
solutions are often based on Audio Fingerprinting, and rely more specifically on the …

ViT-LSTM synergy: a multi-feature approach for speaker identification and mask detection

AB Nassif, I Shahin, M Bader, A Ahmed… - Neural Computing and …, 2024 - Springer
The global health crisis caused by the COVID-19 pandemic has brought new challenges to
speaker identification systems, particularly due to the acoustic alterations caused by the …

AmbiSep: Joint Ambisonic-to-Ambisonic Speech Separation and Noise Reduction

A Herzog, SR Chetupalli… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
Blind separation of the sounds in an Ambisonic sound scene is a challenging problem,
especially when the spatial impression of these sounds needs to be preserved. In this work …