FSD50K: an open dataset of human-labeled sound events

E Fonseca, X Favory, J Pons, F Font… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Most existing datasets for sound event recognition (SER) are relatively small and/or
domain-specific, with the exception of AudioSet, based on over 2M tracks from YouTube videos and …

A systematic literature review on the use of deep learning in precision livestock detection and localization using unmanned aerial vehicles

DBM Yousefi, ASM Rafie, SAR Al-Haddad… - IEEE …, 2022 - ieeexplore.ieee.org
With the ever-increasing importance of dairy and meat production, precision livestock
farming (PLF) using advanced information technologies is emerging to improve farming …

PSLA: Improving audio tagging with pretraining, sampling, labeling, and aggregation

Y Gong, YA Chung, J Glass - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
Audio tagging is an active research area and has a wide range of applications. Since the
release of AudioSet, great progress has been made in advancing model performance, which …

Sound event detection in domestic environments with weakly labeled data and soundscape synthesis

N Turpault, R Serizel, AP Shah… - Workshop on Detection …, 2019 - inria.hal.science
This paper presents Task 4 of the Detection and Classification of Acoustic Scenes and
Events (DCASE) 2019 challenge and provides a first analysis of the challenge results. The …

A comparison of five multiple instance learning pooling functions for sound event detection with weak labeling

Y Wang, J Li, F Metze - ICASSP 2019-2019 IEEE International …, 2019 - ieeexplore.ieee.org
Sound event detection (SED) entails two subtasks: recognizing what types of sound events
are present in an audio stream (audio tagging), and pinpointing their onset and offset times …

Learning sound event classifiers from web audio with noisy labels

E Fonseca, M Plakal, DPW Ellis, F Font… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
As sound event classification moves towards larger datasets, issues of label noise become
inevitable. Web sites can supply large volumes of user-contributed audio and metadata, but …

Weakly-supervised audio-visual segmentation

S Mo, B Raj - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
Audio-visual segmentation is a challenging task that aims to predict pixel-level masks for
sound sources in a video. Previous work applied a comprehensive manually designed …

Training sound event detection on a heterogeneous dataset

N Turpault, R Serizel - arXiv preprint arXiv:2007.03931, 2020 - arxiv.org
Training a sound event detection algorithm on a heterogeneous dataset including both
recorded and synthetic soundscapes that can have various labeling granularity is a non …

Threshold independent evaluation of sound event detection scores

J Ebbers, R Haeb-Umbach… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Performing an adequate evaluation of sound event detection (SED) systems is far from trivial
and is still subject to ongoing research. The recently proposed polyphonic sound detection …

CMKD: CNN/Transformer-based cross-model knowledge distillation for audio classification

Y Gong, S Khurana, A Rouditchenko… - arXiv preprint arXiv …, 2022 - arxiv.org
Audio classification is an active research area with a wide range of applications. Over the
past decade, convolutional neural networks (CNNs) have been the de-facto standard …