With the development and increasing deployment of smart home devices, voice control supports comfortable end user interactions. However, potential end users may refuse to use …
Most existing datasets for sound event recognition (SER) are relatively small and/or domain-specific, with the exception of AudioSet, based on over 2M tracks from YouTube videos and …
Audio pattern recognition is an important research topic in the machine learning area, and includes several tasks such as audio tagging, acoustic scene classification, music …
The ability of artificial intelligence (AI) systems to perceive and comprehend audio signals is crucial for many applications. Although significant progress has been made in this area …
Acoustic data provide scientific and engineering insights in fields ranging from biology and communications to ocean and Earth science. We survey the recent advances and …
In this paper, we propose a convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional …
Y Gong, YA Chung, J Glass - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
Audio tagging is an active research area and has a wide range of applications. Since the release of AudioSet, great progress has been made in advancing model performance, which …
We introduce PixelPlayer, a system that, by leveraging large amounts of unlabeled videos, learns to locate image regions which produce sounds and separate the input sounds into a …
CD Kim, B Kim, H Lee, G Kim - … of the 2019 Conference of the …, 2019 - aclanthology.org
We explore the problem of Audio Captioning: generating natural language description for any kind of audio in the wild, which has been surprisingly unexplored in previous research …