Voice activity detection in the wild: A data-driven approach using teacher-student training

H Dinkel, S Wang, X Xu, M Wu… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Voice activity detection is an essential pre-processing component for speech-related tasks
such as automatic speech recognition (ASR). Traditional supervised VAD systems obtain …

Fault diagnosis of machines using deep convolutional beta-variational autoencoder

G Dewangan, S Maurya - IEEE Transactions on Artificial …, 2021 - ieeexplore.ieee.org
Industries are using fault diagnosis methods to prevent any downtime, which eventually led
them to make profits and take necessary steps beforehand to avoid any mishaps. In recent …

Speech enhancement aided end-to-end multi-task learning for voice activity detection

X Tan, XL Zhang - … 2021-2021 IEEE International Conference on …, 2021 - ieeexplore.ieee.org
Robust voice activity detection (VAD) is a challenging task in low signal-to-noise (SNR)
environments. Recent studies show that speech enhancement is helpful to VAD, but the …

Missing data imputation using an iterative denoising autoencoder (IDAE) for dissolved gas analysis

B Seo, J Shin, T Kim, BD Youn - Electric Power Systems Research, 2022 - Elsevier
With the expansion of the energy market, safe and stable operation of the electrical power
system has become an important issue. In an effort to achieve this goal, much research has …

A unified deep learning framework for short-duration speaker verification in adverse environments

Y Jung, Y Choi, H Lim, H Kim - IEEE Access, 2020 - ieeexplore.ieee.org
Speaker verification (SV) has recently attracted considerable research interest due to the
growing popularity of virtual assistants. At the same time, there is an increasing requirement …

AUC optimization for deep learning-based voice activity detection

XL Zhang, M Xu - EURASIP Journal on Audio, Speech, and Music …, 2022 - Springer
Voice activity detection (VAD) based on deep neural networks (DNN) has demonstrated
good performance in adverse acoustic environments. Current DNN-based VAD optimizes a …

Audio-visual speech recognition based on joint training with audio-visual speech enhancement for robust speech recognition

JW Hwang, J Park, RH Park, HM Park - Applied Acoustics, 2023 - Elsevier
Visual features are attractive cues that can be used for robust automatic speech recognition
(ASR). In particular, speech recognition performance can be improved by combining audio …

Notes on the use of variational autoencoders for speech and audio spectrogram modeling

L Girin, F Roche, T Hueber, S Leglaive - DAFx 2019-22nd International …, 2019 - hal.science
Variational autoencoders (VAEs) are powerful (deep) generative artificial neural networks.
They have been recently used in several papers for speech and audio processing, in …

Denoising convolutional variational autoencoders-based feature learning for automatic detection of plant diseases

V Zilvan, A Ramdan, E Suryawati… - … on Informatics and …, 2019 - ieeexplore.ieee.org
Early detection is critical for maintaining quantity and quality of farming commodity.
Currently, detection of plant diseases still requires human expertise and/or need …

A robust and lightweight voice activity detection algorithm for speech enhancement at low signal-to-noise ratio

Z Zhu, L Zhang, K Pei, S Chen - Digital Signal Processing, 2023 - Elsevier
Voice Activity Detection (VAD) is a crucial component of Speech Enhancement (SE)
for accurately estimating noise, which directly affects the SE effectiveness in improving …