A comparative study of robustness of deep learning approaches for VAD

A CNN-LSTM based deep learning model with high accuracy and robustness for carbon price forecasting: A case of Shenzhen's carbon market in China

H Shi, A Wei, X Xu, Y Zhu, H Hu, S Tang - Journal of Environmental …, 2024 - Elsevier

Accurately predicting carbon trading prices using deep learning models can help
enterprises understand the operational mechanisms and regulations of the carbon market …

Cited by 45 · Related articles · All 4 versions

[PDF] amazon.science

A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows

M Sharma, S Joshi, T Chatterjee, R Hamid - Neurocomputing, 2022 - Elsevier

A robust and language agnostic Voice Activity Detection (VAD) is crucial for Digital
Entertainment Content (DEC). Primary examples of DEC include movies and TV series …

Cited by 21 · Related articles · All 4 versions

[PDF] ieee.org

Active Speaker Detection using Audio, Visual and Depth Modalities: A Survey

SNAM Robi, MAZM Ariffin, MAM Izhar, N Ahmad… - IEEE …, 2024 - ieeexplore.ieee.org

The rapid progress of multimodal signal processing in recent years has cleared the way for
novel applications in human-computer interaction, surveillance, and telecommunication …

Cited by 2 · Related articles · All 2 versions

Optimization of RNN-based speech activity detection

G Gelly, JL Gauvain - IEEE/ACM Transactions on Audio …, 2017 - ieeexplore.ieee.org

Speech activity detection (SAD) is an essential component of automatic speech recognition
systems impacting the overall system performance. This paper investigates an optimization …

Cited by 121 · Related articles · All 3 versions

[PDF] arxiv.org

Voice activity detection in the wild: A data-driven approach using teacher-student training

H Dinkel, S Wang, X Xu, M Wu… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org

Voice activity detection is an essential pre-processing component for speech-related tasks
such as automatic speech recognition (ASR). Traditional supervised VAD systems obtain …

Cited by 55 · Related articles · All 5 versions

[PDF] ieee.org

All your fake detector are belong to us: evaluating adversarial robustness of fake-news detectors under black-box settings

H Ali, MS Khan, A AlGhadhban, M Alazmi… - IEEE …, 2021 - ieeexplore.ieee.org

With the hyperconnectivity and ubiquity of the Internet, the fake news problem now presents
a greater threat than ever before. One promising solution for countering this threat is to …

Cited by 56 · Related articles · All 13 versions

Acoustic data augmentation for Mandarin-English code-switching speech recognition

Y Long, Y Li, Q Zhang, S Wei, H Ye, J Yang - Applied Acoustics, 2020 - Elsevier

Code-switching (CS) is a multilingual phenomenon where a speaker uses different
languages in an utterance or between alternating utterances. Developing large-scale …

Cited by 41 · Related articles

[PDF] arxiv.org

Multitask detection of speaker changes, overlapping speech and voice activity using wav2vec 2.0

M Kunešová, Z Zajíc - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org

Self-supervised learning approaches have lately achieved great success on a broad
spectrum of machine learning problems. In the field of speech processing, one of the most …

Cited by 19 · Related articles · All 3 versions

Automatic speech recognition systems: A survey of discriminative techniques

AP Kaur, A Singh, R Sachdeva… - Multimedia Tools and …, 2023 - search.proquest.com

In the subject of pattern recognition, speech recognition is an important study topic. The
authors give a detailed assessment of voice recognition strategies for several majority …

Cited by 10 · Related articles · All 3 versions

[PDF] arxiv.org

Voice activity detection in the wild via weakly supervised sound event detection

H Dinkel, Y Chen, M Wu, K Yu - arXiv preprint arXiv:2003.12222, 2020 - arxiv.org

Traditional supervised voice activity detection (VAD) methods work well in clean and
controlled scenarios, with performance severely degrading in real-world applications. One …

Cited by 33 · Related articles · All 8 versions
