Interpretation of intelligence in CNN-pooling processes: a methodological survey

N Akhtar, U Ragavendran - Neural computing and applications, 2020 - Springer
The convolutional neural network architecture has different components like convolution and
pooling. Pooling is a crucial component placed after the convolution layer. It plays a vital …

VGGSound: A large-scale audio-visual dataset

H Chen, W Xie, A Vedaldi… - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Our goal is to collect a large-scale audio-visual dataset with low label noise from videos 'in
the wild' using computer vision techniques. The resulting dataset can be used for training …

A comprehensive review of polyphonic sound event detection

TK Chan, CS Chin - IEEE Access, 2020 - ieeexplore.ieee.org
One of the most amazing functions of the human auditory system is the ability to detect all
kinds of sound events in the environment. With the technologies and hardware advances …

Audio-visual speech and gesture recognition by sensors of mobile devices

D Ryumin, D Ivanko, E Ryumina - Sensors, 2023 - mdpi.com
Audio-visual speech recognition (AVSR) is one of the most promising solutions for reliable
speech recognition, particularly when audio is corrupted by noise. Additional visual …

A comparison of five multiple instance learning pooling functions for sound event detection with weak labeling

Y Wang, J Li, F Metze - ICASSP 2019-2019 IEEE International …, 2019 - ieeexplore.ieee.org
Sound event detection (SED) entails two subtasks: recognizing what types of sound events
are present in an audio stream (audio tagging), and pinpointing their onset and offset times …

General-purpose tagging of freesound audio with audioset labels: Task description, dataset, and baseline

E Fonseca, M Plakal, F Font, DPW Ellis… - arXiv preprint arXiv …, 2018 - arxiv.org
This paper describes Task 2 of the DCASE 2018 Challenge, titled "General-purpose audio
tagging of Freesound content with AudioSet labels". This task was hosted on the Kaggle …

Adaptive pooling operators for weakly labeled sound event detection

B McFee, J Salamon, JP Bello - IEEE/ACM Transactions on …, 2018 - ieeexplore.ieee.org
Sound event detection (SED) methods are tasked with labeling segments of audio
recordings by the presence of active sound sources. SED is typically posed as a supervised …

Sound event detection of weakly labelled data with CNN-Transformer and automatic threshold optimization

Q Kong, Y Xu, W Wang… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Sound event detection (SED) is a task to detect sound events in an audio recording. One
challenge of the SED task is that many datasets such as the Detection and Classification of …

Heartbeat sound signal classification using deep learning

A Raza, A Mehmood, S Ullah, M Ahmad, GS Choi… - Sensors, 2019 - mdpi.com
Presently, most deaths are caused by heart disease. To overcome this situation, heartbeat
sound analysis is a convenient way to diagnose heart disease. Heartbeat sound …

Large-scale weakly labeled semi-supervised sound event detection in domestic environments

R Serizel, N Turpault, H Eghbal-Zadeh… - arXiv preprint arXiv …, 2018 - arxiv.org
This paper presents DCASE 2018 task 4. The task evaluates systems for the large-scale
detection of sound events using weakly labeled data (without time boundaries). The target of …