In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …
X Gao, C Gupta, H Li - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Lyrics are the words that make up a song, while chords are harmonic sets of multiple notes in music. Lyrics and chords are generally essential information in music, i.e. unaccompanied …
This article presents a multimodal dataset comprising various representations and annotations of Franz Schubert's song cycle Winterreise. Schubert's seminal work constitutes …
C Gupta, E Yılmaz, H Li - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Automatic lyrics alignment and transcription in polyphonic music are challenging tasks because the singing vocals are corrupted by the background music. In this work, we propose …
We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even …
L Ou, X Gu, Y Wang - arXiv preprint arXiv:2207.09747, 2022 - arxiv.org
Automatic speech recognition (ASR) has progressed significantly in recent years due to the emergence of large-scale datasets and the self-supervised learning (SSL) paradigm …
The goal of singing voice separation is to recover the vocals signal from music mixtures. State-of-the-art performance is achieved by deep neural networks trained in a supervised …
X Gao, C Gupta, H Li - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org
Lyrics transcription of polyphonic music is challenging not only because the singing vocals are corrupted by the background music, but also because the background music and the …
C Gupta, H Li, M Goto - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Singing, the vocal production of musical tones, is one of the most important elements of music. Addressing the needs of real-world applications, the study of technologies related to …