Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

Cited by 10 · Related articles · All 4 versions

Contrastive audio-language learning for music

I Manco, E Benetos, E Quinton, G Fazekas - arXiv preprint arXiv …, 2022 - arxiv.org

As one of the most intuitive interfaces known to humans, natural language has the potential
to mediate many tasks that involve human-computer interaction, especially in application …

Cited by 52 · Related articles · All 8 versions

MSCCov19Net: multi-branch deep learning model for COVID-19 detection from cough sounds

S Ulukaya, AA Sarıca, O Erdem, A Karaali - Medical & Biological …, 2023 - Springer

Coronavirus has an impact on millions of lives and has been added to the important
pandemics that continue to affect with its variants. Since it is transmitted through the …

Cited by 24 · Related articles · All 6 versions

Multimodal music information processing and retrieval: Survey and future challenges

F Simonetta, S Ntalampiras… - … workshop on multilayer …, 2019 - ieeexplore.ieee.org

Towards improving the performance in various music information processing tasks, recent
studies exploit different modalities able to capture diverse aspects of music. Such modalities …

Cited by 84 · Related articles · All 10 versions

Audio-based musical version identification: Elements and challenges

F Yesiler, G Doras, RM Bittner… - IEEE Signal …, 2021 - ieeexplore.ieee.org

Creating novel interpretations of existing musical compositions is and has always been an
essential part of musical practice. Before the advent of recorded music, listening to a piece of …

Cited by 11 · Related articles · All 3 versions

How blind and visually impaired composers, producers, and songwriters leverage and adapt music technology

WC Payne, AY Xu, F Ahmed, L Ye, A Hurst - Proceedings of the 22nd …, 2020 - dl.acm.org

Today, music creation software and hardware are central to the workflow of most
professional composers, producers, and songwriters. Music is an aural art form, but it is …

Cited by 35 · Related articles · All 2 versions

Query by Video: Cross-modal Music Retrieval

B Li, A Kumar - ISMIR, 2019 - academia.edu

Cross-modal retrieval learns the relationship between the two types of data in a common
space so that an input from one modality can retrieve data from a different modality. We …

Cited by 49 · Related articles · All 2 versions

Erkomaishvili Dataset: A Curated Corpus of Traditional Georgian Vocal Music for Computational Musicology

S Rosenzweig, F Scherbaum… - Trans. Int. Soc …, 2020 - pdfs.semanticscholar.org

The analysis of recorded audio material using computational methods has received
increased attention in ethnomusicological research. We present a curated dataset of …

Cited by 37 · Related articles · All 10 versions

Discover: Disentangled music representation learning for cover song identification

J Xun, S Zhang, Y Yang, J Zhu, L Deng… - Proceedings of the 46th …, 2023 - dl.acm.org

In the field of music information retrieval (MIR), cover song identification (CSI) is a
challenging task that aims to identify cover versions of a query song from a massive …

Cited by 4 · Related articles · All 4 versions

Learning explicit and implicit dual common subspaces for audio-visual cross-modal retrieval

D Zeng, J Wu, G Hattori, R Xu, Y Yu - ACM Transactions on Multimedia …, 2023 - dl.acm.org

Audio-visual tracks in video contain rich semantic information with potential in many
applications and research. Since the audio-visual data have inconsistent distributions and …

Cited by 11 · Related articles · All 3 versions
