Language-based audio retrieval task in DCASE 2022 challenge

H Xie, S Lipping, T Virtanen - arXiv preprint arXiv:2206.06108, 2022 - arxiv.org
Language-based audio retrieval is a task, where natural language textual captions are used
as queries to retrieve audio signals from a dataset. It has been first introduced into DCASE …

Audio retrieval with wavtext5k and clap training

S Deshmukh, B Elizalde, H Wang - arXiv preprint arXiv:2209.14275, 2022 - arxiv.org
Audio-Text retrieval takes a natural language query to retrieve relevant audio files in a
database. Conversely, Text-Audio retrieval takes an audio file as a query to retrieve relevant …

Audio retrieval with natural language queries: A benchmark study

AS Koepke, AM Oncescu, JF Henriques… - IEEE Transactions …, 2022 - ieeexplore.ieee.org
The objectives of this work are cross-modal text-audio and audio-text retrieval, in which the
goal is to retrieve the audio content from a pool of candidates that best matches a given …

Language-based audio retrieval with pre-trained models

X Mei, X Liu, H Liu, J Sun, MD Plumbley… - … and Classification of …, 2022 - dcase.community
This technical report presents a language-based audio retrieval system that we submitted to
Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2022 Task …

IRIT-UPS DCASE 2023 audio captioning and retrieval system

E Labbé, T Pellegrini, J Pinquier - Proc. Conf. Detection …, 2023 - dcase.community
This technical report provides a concise overview of our systems submitted to the DCASE
Challenge 2023 for tasks 6a, "Automated Audio Captioning" (AAC), and 6b, "Language …

Audio retrieval with natural language queries

AM Oncescu, A Koepke, JF Henriques, Z Akata… - arXiv preprint arXiv …, 2021 - arxiv.org
We consider the task of retrieving audio using free-form natural language queries. To study
this problem, which has received limited attention in the existing literature, we introduce …

HYU submission for the DCASE 2023 task 6a: Automated audio captioning model using AL-MixGen and synonyms substitution

JH Cho, YA Park, J Kim, JH Chang - Proc. Detection and …, 2023 - dcase.community
This paper presents the automated audio captioning model for participating in the detection
and classification of acoustic scenes and events 2023 challenge task 6A. The model …

The DCASE 2021 challenge task 6 system: Automated audio captioning with weakly supervised pre-training and word selection methods

W Yuan, Q Han, D Liu, X Li, Z Yang - Tech. Rep., DCASE …, 2021 - dcase.community
This technical report describes the system participating to the Detection and Classification of
Acoustic Scenes and Events (DCASE) 2021 Challenge, Task 6: automated audio …

Automated audio captioning with weakly supervised pre-training and word selection methods.

Q Han, W Yuan, D Liu, X Li, Z Yang - DCASE, 2021 - dcase.community
Audio captioning is a multi-modal task, focusing on generating a natural sentence to
describe the content in an audio clip. This paper proposes a solution of automated audio …

Text-to-audio grounding: Building correspondence between captions and sound events

X Xu, H Dinkel, M Wu, K Yu - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Automated Audio Captioning is a cross-modal task, generating natural language
descriptions to summarize the audio clips' sound events. However, grounding the actual …