Speech enhancement aided end-to-end multi-task learning for voice activity detection

X Tan, XL Zhang - … 2021-2021 IEEE International Conference on …, 2021 - ieeexplore.ieee.org
Robust voice activity detection (VAD) is a challenging task in low signal-to-noise ratio (SNR)
environments. Recent studies show that speech enhancement is helpful to VAD, but the …

Should we always separate?: Switching between enhanced and observed signals for overlapping speech recognition

H Sato, T Ochiai, M Delcroix, K Kinoshita… - arXiv preprint arXiv …, 2021 - arxiv.org
Although recent advances in deep learning technology improved automatic speech
recognition (ASR), it remains difficult to recognize speech when it overlaps other people's …

Time-domain joint training strategies of speech enhancement and intent classification neural models

MN Ali, D Falavigna, A Brutti - Sensors, 2022 - mdpi.com
Robustness against background noise and reverberation is essential for many real-world
speech-based applications. One way to achieve this robustness is to employ a speech …

Joint training ResCNN-based voice activity detection with speech enhancement

T Xu, H Zhang, X Zhang - 2019 Asia-Pacific Signal and …, 2019 - ieeexplore.ieee.org
Voice activity detection (VAD) is considered a solved problem in noise-free conditions, but
it is still a challenging task in low signal-to-noise ratio (SNR) noisy conditions. Intuitively …

Voice Activity Detection with Teacher-Student Domain Emulation.

J Luckenbaugh, S Abplanalp, R Gonzalez, D Fulford… - Interspeech, 2021 - isca-archive.org
Transfer learning is a promising approach to increase performance for many speech-based
systems, including voice activity detection (VAD). Domain adaptation, a subfield of transfer …

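Research on a multi-task learning method for speech enhancement and detection.

王师琦, 曾庆宁, 龙超, 熊松龄… - Journal of Computer …, 2021 - search.ebscohost.com
In many practical speech signal processing applications, the system must handle multiple tasks
in real time with low latency and be highly robust to noise. To address these problems, a speech enhancement and voice activity detection (Voice …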

Exploiting magnitude and phase aware deep neural network for replay attack detection

K Phapatanaburi, P Buayai… - ECTI Transactions …, 2020 - pdfs.semanticscholar.org
The magnitude and phase aware deep neural network (MP-aware DNN), based on Fast Fourier
Transform information, has recently received more attention in many speech applications …

Neural Enhancement Strategies for Robust Speech Processing

MNAM Nawar - 2023 - iris.unitn.it
In real-world scenarios, speech signals are often contaminated with environmental noise
and reverberation, which degrade speech quality and intelligibility. Lately, the development …