Deep learning-based non-intrusive multi-objective speech assessment model with cross-domain features

RE Zezario, SW Fu, F Chen, CS Fuh… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
This study proposes a cross-domain multi-objective speech assessment model, called
MOSA-Net, which can simultaneously estimate the speech quality, intelligibility, and …

Dual-stream noise and speech information perception based speech enhancement

N Li, L Wang, Q Zhang, J Dang - Expert Systems with Applications, 2025 - Elsevier
In real-world scenarios, dynamic ambient noise often degrades speech quality, highlighting
the need for advanced speech enhancement techniques. Traditional methods, which rely on …

CAM: Context-aware masking for robust speaker verification

YQ Yu, S Zheng, H Suo, Y Lei… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Performance degradation caused by noise has been a long-standing challenge for speaker
verification. Previous methods usually involve applying a denoising transformation to …

The whole is greater than the sum of its parts: improving music source separation by bridging networks

R Sawata, N Takahashi, S Uhlich, S Takahashi… - EURASIP Journal on …, 2024 - Springer
This paper presents the crossing scheme (X-scheme) for improving the performance of deep
neural network (DNN)-based music source separation (MSS) with almost no increasing …

NADiffuSE: Noise-aware diffusion-based model for speech enhancement

W Wang, D Yang, Q Ye, B Cao… - 2023 Asia Pacific Signal …, 2023 - ieeexplore.ieee.org
The goal of speech enhancement (SE) is to eliminate the background interference from the
noisy speech signal. Generative models such as diffusion models (DM) have been applied …

Noise-Aware Extended U-Net With Split Encoder and Feature Refinement Module for Robust Speaker Verification in Noisy Environments

CY Lim, J Heo, JH Kim, HS Shin, HJ Yu - IEEE Access, 2024 - ieeexplore.ieee.org
Speech data gathered from real-world environments typically contain noise, a significant
element that undermines the performance of deep neural network-based speaker …

Text-informed knowledge distillation for robust speech enhancement and recognition

W Wang, W Zhang, S Lin, Y Qian - 2022 13th International …, 2022 - ieeexplore.ieee.org
Most existing speech enhancement (SE) approaches heavily depend on simulated data for
training, leading to performance degradation on realistic data and subsequent speech …

NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling

CC Lee, CH Hu, YC Lin, CS Chen, HM Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
For deep learning-based speech enhancement (SE) systems, the training-test acoustic
mismatch can cause notable performance degradation. To address the mismatch issue …

Hardware Efficient Speech Enhancement With Noise Aware Multi-Target Deep Learning

S Abdullah, M Zamanim… - IEEE Open Journal of …, 2024 - ieeexplore.ieee.org
This paper describes a supervised speech enhancement (SE) method utilising a noise-
aware four-layer deep neural network and training target switching. For optimal speech …

Speech enhancement with zero-shot model selection

RE Zezario, CS Fuh, HM Wang… - 2021 29th European …, 2021 - ieeexplore.ieee.org
Recent research on speech enhancement (SE) has seen the emergence of deep-learning-
based methods. It is still a challenging task to determine the effective ways to increase the …