A comprehensive survey of automated audio captioning

X Xu, M Wu, K Yu - arXiv preprint arXiv:2205.05357, 2022 - arxiv.org
Automated audio captioning, a task that mimics human perception as well as innovatively
links audio processing and natural language processing, has overseen much progress over …

SpecAugment++: A hidden space data augmentation method for acoustic scene classification

H Wang, Y Zou, W Wang - arXiv preprint arXiv:2103.16858, 2021 - arxiv.org
In this paper, we present SpecAugment++, a novel data augmentation method for deep
neural networks based acoustic scene classification (ASC). Different from other popular data …

CPJKU submission to dcase22: Distilling knowledge for low-complexity convolutional neural networks from a patchout audio transformer

F Schmid, S Masoudian, K Koutini… - Tech. Rep., Detection …, 2022 - dcase.community
In this technical report, we describe the CP-JKU team's submission for Task 1 Low-
Complexity Acoustic Scene Classification of the DCASE 22 challenge [1]. We use …

CPJKU submission to dcase21: Cross-device audio scene classification with wide sparse frequency-damped CNNs

K Koutini, S Jan, G Widmer - Tech. Rep., 2021 - dcase.community
We describe the CP-JKU team's submission for Task 1A Low-Complexity Acoustic Scene
Classification with Multiple Devices [1] of the DCASE2021 Challenge. We use Receptive …

Lightweight deep neural networks for acoustic scene classification and an effective visualization for presenting sound scene contexts

L Pham, D Ngo, D Salovic, A Jalali, A Schindler… - Applied Acoustics, 2023 - Elsevier
In this paper, we propose lightweight deep neural networks for Acoustic Scene Classification
(ASC) and a visualization method for presenting a sound scene context. To this end, we first …

Robust, general, and low complexity acoustic scene classification systems and an effective visualization for presenting a sound scene context

L Pham, D Salovic, A Jalali, A Schindler, K Tran… - arXiv preprint arXiv …, 2022 - arxiv.org
In this paper, we present a comprehensive analysis of Acoustic Scene Classification (ASC),
the task of identifying the scene of an audio recording from its acoustic signature. In …

Wider or deeper neural network architecture for acoustic scene classification with mismatched recording devices

L Pham, K Tran, D Ngo, H Tang, S Phan… - Proceedings of the 4th …, 2022 - dl.acm.org
In this paper, we present a robust and low complexity model for Acoustic Scene
Classification (ASC), the task of identifying the scene of an audio recording. We firstly …

A variational Bayesian approach to learning latent variables for acoustic knowledge transfer

H Hu, SM Siniscalchi, CHH Yang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
We propose a variational Bayesian (VB) approach to learning distributions of latent
variables in deep neural network (DNN) models for cross-domain knowledge transfer, to …

On-line audio-to-lyrics alignment based on a reference performance

C Brazier, G Widmer - arXiv preprint arXiv:2107.14496, 2021 - arxiv.org
Audio-to-lyrics alignment has become an increasingly active research task in MIR,
supported by the emergence of several open-source datasets of audio recordings with word …

Memory-replay knowledge distillation

J Wang, P Zhang, Y Li - Sensors, 2021 - mdpi.com
Knowledge Distillation (KD), which transfers the knowledge from a teacher to a student
network by penalizing their Kullback–Leibler (KL) divergence, is a widely used tool for Deep …