Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

Contrastive audio-language learning for music

I Manco, E Benetos, E Quinton, G Fazekas - arXiv preprint arXiv …, 2022 - arxiv.org
As one of the most intuitive interfaces known to humans, natural language has the potential
to mediate many tasks that involve human-computer interaction, especially in application …

MSCCov19Net: multi-branch deep learning model for COVID-19 detection from cough sounds

S Ulukaya, AA Sarıca, O Erdem, A Karaali - Medical & Biological …, 2023 - Springer
Coronavirus has an impact on millions of lives and has been added to the important
pandemics that continue to affect with its variants. Since it is transmitted through the …

Multimodal music information processing and retrieval: Survey and future challenges

F Simonetta, S Ntalampiras… - … workshop on multilayer …, 2019 - ieeexplore.ieee.org
Towards improving the performance in various music information processing tasks, recent
studies exploit different modalities able to capture diverse aspects of music. Such modalities …

Audio-based musical version identification: Elements and challenges

F Yesiler, G Doras, RM Bittner… - IEEE Signal …, 2021 - ieeexplore.ieee.org
Creating novel interpretations of existing musical compositions is and has always been an
essential part of musical practice. Before the advent of recorded music, listening to a piece of …

How blind and visually impaired composers, producers, and songwriters leverage and adapt music technology

WC Payne, AY Xu, F Ahmed, L Ye, A Hurst - Proceedings of the 22nd …, 2020 - dl.acm.org
Today, music creation software and hardware are central to the workflow of most
professional composers, producers, and songwriters. Music is an aural art form, but it is …

Query by Video: Cross-modal Music Retrieval.

B Li, A Kumar - ISMIR, 2019 - academia.edu
Cross-modal retrieval learns the relationship between the two types of data in a common
space so that an input from one modality can retrieve data from a different modality. We …

Erkomaishvili Dataset: A Curated Corpus of Traditional Georgian Vocal Music for Computational Musicology.

S Rosenzweig, F Scherbaum… - Trans. Int. Soc …, 2020 - pdfs.semanticscholar.org
The analysis of recorded audio material using computational methods has received
increased attention in ethnomusicological research. We present a curated dataset of …

Discover: Disentangled music representation learning for cover song identification

J Xun, S Zhang, Y Yang, J Zhu, L Deng… - Proceedings of the 46th …, 2023 - dl.acm.org
In the field of music information retrieval (MIR), cover song identification (CSI) is a
challenging task that aims to identify cover versions of a query song from a massive …

Learning explicit and implicit dual common subspaces for audio-visual cross-modal retrieval

D Zeng, J Wu, G Hattori, R Xu, Y Yu - ACM Transactions on Multimedia …, 2023 - dl.acm.org
Audio-visual tracks in video contain rich semantic information with potential in many
applications and research. Since the audio-visual data have inconsistent distributions and …