A CNN-LSTM based deep learning model with high accuracy and robustness for carbon price forecasting: A case of Shenzhen's carbon market in China

H Shi, A Wei, X Xu, Y Zhu, H Hu, S Tang - Journal of Environmental …, 2024 - Elsevier
Accurately predicting carbon trading prices using deep learning models can help
enterprises understand the operational mechanisms and regulations of the carbon market …

A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows

M Sharma, S Joshi, T Chatterjee, R Hamid - Neurocomputing, 2022 - Elsevier
A robust and language agnostic Voice Activity Detection (VAD) is crucial for Digital
Entertainment Content (DEC). Primary examples of DEC include movies and TV series …

Active Speaker Detection using Audio, Visual and Depth Modalities: A Survey

SNAM Robi, MAZM Ariffin, MAM Izhar, N Ahmad… - IEEE …, 2024 - ieeexplore.ieee.org
The rapid progress of multimodal signal processing in recent years has cleared the way for
novel applications in human-computer interaction, surveillance, and telecommunication …

Optimization of RNN-based speech activity detection

G Gelly, JL Gauvain - IEEE/ACM Transactions on Audio …, 2017 - ieeexplore.ieee.org
Speech activity detection (SAD) is an essential component of automatic speech recognition
systems impacting the overall system performance. This paper investigates an optimization …

Voice activity detection in the wild: A data-driven approach using teacher-student training

H Dinkel, S Wang, X Xu, M Wu… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Voice activity detection is an essential pre-processing component for speech-related tasks
such as automatic speech recognition (ASR). Traditional supervised VAD systems obtain …

All your fake detector are belong to us: evaluating adversarial robustness of fake-news detectors under black-box settings

H Ali, MS Khan, A AlGhadhban, M Alazmi… - IEEE …, 2021 - ieeexplore.ieee.org
With the hyperconnectivity and ubiquity of the Internet, the fake news problem now presents
a greater threat than ever before. One promising solution for countering this threat is to …

Acoustic data augmentation for Mandarin-English code-switching speech recognition

Y Long, Y Li, Q Zhang, S Wei, H Ye, J Yang - Applied Acoustics, 2020 - Elsevier
Code-switching (CS) is a multilingual phenomenon where a speaker uses different
languages in an utterance or between alternating utterances. Developing large-scale …

Multitask detection of speaker changes, overlapping speech and voice activity using wav2vec 2.0

M Kunešová, Z Zajíc - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
Self-supervised learning approaches have lately achieved great success on a broad
spectrum of machine learning problems. In the field of speech processing, one of the most …

Automatic speech recognition systems: A survey of discriminative techniques

AP Kaur, A Singh, R Sachdeva… - Multimedia Tools and …, 2023 - search.proquest.com
In the subject of pattern recognition, speech recognition is an important study topic. The
authors give a detailed assessment of voice recognition strategies for several majority …

Voice activity detection in the wild via weakly supervised sound event detection

H Dinkel, Y Chen, M Wu, K Yu - arXiv preprint arXiv:2003.12222, 2020 - arxiv.org
Traditional supervised voice activity detection (VAD) methods work well in clean and
controlled scenarios, with performance severely degrading in real-world applications. One …