A survey of sound source localization with deep learning methods

PA Grumiaux, S Kitić, L Girin, A Guérin - The Journal of the Acoustical …, 2022 - pubs.aip.org
This article is a survey of deep learning methods for single and multiple sound source
localization, with a focus on sound source localization in indoor environments, where …

PANNs: Large-scale pretrained audio neural networks for audio pattern recognition

Q Kong, Y Cao, T Iqbal, Y Wang… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org
Audio pattern recognition is an important research topic in the machine learning area, and
includes several tasks such as audio tagging, acoustic scene classification, music …

Overview and evaluation of sound event localization and detection in DCASE 2019

A Politis, A Mesaros, S Adavanne… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org
Sound event localization and detection is a novel area of research that emerged from the
combined interest of analyzing the acoustic scene in terms of the spatial and temporal …

A framework for the robust evaluation of sound event detection

Ç Bilen, G Ferroni, F Tuveri, J Azcarreta… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
This work defines a new framework for performance evaluation of polyphonic sound event
detection (SED) systems, which overcomes the limitations of the conventional collar-based …

Sound event detection of weakly labelled data with CNN-Transformer and automatic threshold optimization

Q Kong, Y Xu, W Wang… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Sound event detection (SED) is a task to detect sound events in an audio recording. One
challenge of the SED task is that many datasets such as the Detection and Classification of …

A comprehensive survey of automated audio captioning

X Xu, M Wu, K Yu - arXiv preprint arXiv:2205.05357, 2022 - arxiv.org
Automated audio captioning, a task that mimics human perception as well as innovatively
links audio processing and natural language processing, has overseen much progress over …

GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music

Q Kong, B Li, J Chen, Y Wang - arXiv preprint arXiv:2010.07061, 2020 - arxiv.org
Symbolic music datasets are important for music information retrieval and musical analysis.
However, there is a lack of large-scale symbolic datasets for classical piano music. In this …

Spying with your robot vacuum cleaner: eavesdropping via lidar sensors

S Sami, Y Dai, SRX Tan, N Roy, J Han - Proceedings of the 18th …, 2020 - dl.acm.org
Eavesdropping on private conversations is one of the most common yet detrimental threats
to privacy. A number of recent works have explored side-channels on smart devices for …

Polyphonic sound event detection and localization using a two-stage strategy

Y Cao, Q Kong, T Iqbal, F An, W Wang… - arXiv preprint arXiv …, 2019 - arxiv.org
Sound event detection (SED) and localization refer to recognizing sound events and
estimating their spatial and temporal locations. Using neural networks has become the …

Acoustic scene classification across cities and devices via feature disentanglement

Y Tan, H Ai, S Li, MD Plumbley - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Acoustic Scene Classification (ASC) is a task that classifies a scene according to
environmental acoustic signals. Audios collected from different cities and devices often …