Contrastive learning-based audio to lyrics alignment for multiple languages - Academic Resource Search

12 results (0.02 sec)

Contrastive learning-based audio to lyrics alignment for multiple languages

[PDF] arxiv.org

Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

Cited by 6 · Related articles · All 2 versions

[PDF] neurips.cc

Marble: Music audio representation benchmark for universal evaluation

R Yuan, Y Ma, Y Li, G Zhang, X Chen… - Advances in …, 2023 - proceedings.neurips.cc

In the era of extensive intersection between art and Artificial Intelligence (AI), such as image
generation and fiction co-creation, AI for music remains relatively nascent, particularly in …

Cited by 17 · Related articles · All 7 versions

[PDF] arxiv.org

Adapting pretrained speech model for mandarin lyrics transcription and alignment

JY Wang, CI Leong, YC Lin, L Su… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

The tasks of automatic lyrics transcription and lyrics alignment have witnessed significant
performance improvements in the past few years. However, most of the previous works only …

Cited by 4 · Related articles · All 3 versions

[PDF] arxiv.org

Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model

J Huang, E Benetos - arXiv preprint arXiv:2406.17618, 2024 - arxiv.org

Multilingual automatic lyrics transcription (ALT) is a challenging task due to the limited
availability of labelled data and the challenges introduced by singing, compared to …

Cited by 1 · Related articles

[PDF] arxiv.org

Roadmap towards Superhuman Speech Understanding using Large Language Models

F Bu, Y Zhang, X Wang, B Wang, Q Liu, H Li - arXiv preprint arXiv …, 2024 - arxiv.org

The success of large language models (LLMs) has prompted efforts to integrate speech and
audio data, aiming to create general foundation models capable of processing both textual …

Related articles · All 2 versions

[PDF] arxiv.org

PolySinger: Singing-Voice to Singing-Voice Translation from English to Japanese

S Antonisen, I López-Espejo - arXiv preprint arXiv:2407.14399, 2024 - arxiv.org

The speech domain prevails in the spotlight for several natural language processing (NLP)
tasks while the singing domain remains less explored. The culmination of NLP is the speech …

Related articles · All 2 versions

[PDF] arxiv.org

Lyrics Transcription for Humans: A Readability-Aware Benchmark

O Cífka, H Schreiber, L Miner, FR Stöter - arXiv preprint arXiv:2408.06370, 2024 - arxiv.org

Writing down lyrics for human consumption involves not only accurately capturing word
sequences, but also incorporating punctuation and formatting for clarity and to convey …

Related articles · All 2 versions

[PDF] arxiv.org

A Real-Time Lyrics Alignment System Using Chroma and Phonetic Features for Classical Vocal Performance

J Park, S Yong, T Kwon, J Nam - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org

The goal of real-time lyrics alignment is to take live singing audio as input and to pinpoint the
exact position within given lyrics on the fly. The task can benefit real-world applications such …

Related articles · All 3 versions

[PDF] arxiv.org

Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark

O Cífka, C Dimitriou, C Wang, H Schreiber… - arXiv preprint arXiv …, 2023 - arxiv.org

Current automatic lyrics transcription (ALT) benchmarks focus exclusively on word content
and ignore the finer nuances of written lyrics including formatting and punctuation, which …

Cited by 2 · Related articles · All 4 versions

MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics and Audio

JY Wang, CC Wang, CI Leong… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

We introduce MIR-MLPop, a publicly available multilingual pop music dataset designed for
automatic lyrics transcription and lyrics alignment in polyphonic music. The dataset …

Cited by 1 · Related articles