Large-scale weakly supervised audio classification using gated convolutional neural network

N Akhtar, U Ragavendran - Neural computing and applications, 2020 - Springer

The convolutional neural network architecture has different components like convolution and
pooling. Pooling is a crucial component placed after the convolution layer. It plays a vital …

Cited by 165 · Related articles · All 4 versions

[PDF] arxiv.org

VGGSound: A large-scale audio-visual dataset

H Chen, W Xie, A Vedaldi… - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org

Our goal is to collect a large-scale audio-visual dataset with low label noise from videos 'in
the wild' using computer vision techniques. The resulting dataset can be used for training …

Cited by 456 · Related articles · All 10 versions

[PDF] ieee.org

A comprehensive review of polyphonic sound event detection

TK Chan, CS Chin - IEEE Access, 2020 - ieeexplore.ieee.org

One of the most amazing functions of the human auditory system is the ability to detect all
kinds of sound events in the environment. With the technologies and hardware advances …

Cited by 55 · Related articles · All 5 versions

[PDF] mdpi.com

Audio-visual speech and gesture recognition by sensors of mobile devices

D Ryumin, D Ivanko, E Ryumina - Sensors, 2023 - mdpi.com

Audio-visual speech recognition (AVSR) is one of the most promising solutions for reliable
speech recognition, particularly when audio is corrupted by noise. Additional visual …

Cited by 42 · Related articles · All 9 versions

[PDF] cmu.edu

A comparison of five multiple instance learning pooling functions for sound event detection with weak labeling

Y Wang, J Li, F Metze - ICASSP 2019-2019 IEEE International …, 2019 - ieeexplore.ieee.org

Sound event detection (SED) entails two subtasks: recognizing what types of sound events
are present in an audio stream (audio tagging), and pinpointing their onset and offset times …

Cited by 203 · Related articles · All 6 versions

[PDF] arxiv.org

General-purpose tagging of Freesound audio with AudioSet labels: Task description, dataset, and baseline

E Fonseca, M Plakal, F Font, DPW Ellis… - arXiv preprint arXiv …, 2018 - arxiv.org

This paper describes Task 2 of the DCASE 2018 Challenge, titled "General-purpose audio
tagging of Freesound content with AudioSet labels". This task was hosted on the Kaggle …

Cited by 184 · Related articles · All 11 versions

[PDF] ieee.org

Adaptive pooling operators for weakly labeled sound event detection

B McFee, J Salamon, JP Bello - IEEE/ACM Transactions on …, 2018 - ieeexplore.ieee.org

Sound event detection (SED) methods are tasked with labeling segments of audio
recordings by the presence of active sound sources. SED is typically posed as a supervised …

Cited by 195 · Related articles · All 8 versions

[PDF] arxiv.org

Sound event detection of weakly labelled data with CNN-Transformer and automatic threshold optimization

Q Kong, Y Xu, W Wang… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org

Sound event detection (SED) is a task to detect sound events in an audio recording. One
challenge of the SED task is that many datasets such as the Detection and Classification of …

Cited by 129 · Related articles · All 8 versions

[PDF] mdpi.com

Heartbeat sound signal classification using deep learning

A Raza, A Mehmood, S Ullah, M Ahmad, GS Choi… - Sensors, 2019 - mdpi.com

Presently, most deaths are caused by heart disease. To overcome this situation, heartbeat
sound analysis is a convenient way to diagnose heart disease. Heartbeat sound …

Cited by 130 · Related articles · All 13 versions

[PDF] arxiv.org

Large-scale weakly labeled semi-supervised sound event detection in domestic environments

R Serizel, N Turpault, H Eghbal-Zadeh… - arXiv preprint arXiv …, 2018 - arxiv.org

This paper presents DCASE 2018 task 4. The task evaluates systems for the large-scale
detection of sound events using weakly labeled data (without time boundaries). The target of …

Cited by 166 · Related articles · All 12 versions
