Our goal is to collect a large-scale audio-visual dataset with low label noise from videos 'in the wild' using computer vision techniques. The resulting dataset can be used for training …
One of the most amazing functions of the human auditory system is the ability to detect all kinds of sound events in the environment. With the technologies and hardware advances …
Audio-visual speech recognition (AVSR) is one of the most promising solutions for reliable speech recognition, particularly when audio is corrupted by noise. Additional visual …
Y Wang, J Li, F Metze - ICASSP 2019 - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Sound event detection (SED) entails two subtasks: recognizing what types of sound events are present in an audio stream (audio tagging), and pinpointing their onset and offset times …
This paper describes Task 2 of the DCASE 2018 Challenge, titled "General-purpose audio tagging of Freesound content with AudioSet labels". This task was hosted on the Kaggle …
Sound event detection (SED) methods are tasked with labeling segments of audio recordings by the presence of active sound sources. SED is typically posed as a supervised …
Q Kong, Y Xu, W Wang… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Sound event detection (SED) is a task to detect sound events in an audio recording. One challenge of the SED task is that many datasets such as the Detection and Classification of …
Presently, most deaths are caused by heart disease. To overcome this situation, heartbeat sound analysis is a convenient way to diagnose heart disease. Heartbeat sound …
This paper presents DCASE 2018 task 4. The task evaluates systems for the large-scale detection of sound events using weakly labeled data (without time boundaries). The target of …