Automatic lyrics transcription of polyphonic music with lyrics-chord multi-task learning

X Gao, C Gupta, H Li - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Lyrics are the words that make up a song, while chords are harmonic sets of multiple notes
in music. Lyrics and chords are generally essential information in music, i.e. unaccompanied …

NHSS: A speech and singing parallel database

B Sharma, X Gao, K Vijayan, X Tian, H Li - Speech Communication, 2021 - Elsevier
We present a database of parallel recordings of speech and singing, collected and released
by the Human Language Technology (HLT) laboratory at the National University of …

Genre-conditioned acoustic models for automatic lyrics transcription of polyphonic music

X Gao, C Gupta, H Li - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org
Lyrics transcription of polyphonic music is challenging not only because the singing vocals
are corrupted by the background music, but also because the background music and the …

Deep learning approaches in topics of singing information processing

C Gupta, H Li, M Goto - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Singing, the vocal production of musical tones, is one of the most important elements of
music. Addressing the needs of real-world applications, the study of technologies related to …

Polyscriber: Integrated fine-tuning of extractor and lyrics transcriber for polyphonic music

X Gao, C Gupta, H Li - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
Lyrics transcription of polyphonic music is challenging as the background music affects lyrics
intelligibility. Typically, lyrics transcription can be performed by a two-step pipeline, i.e. a …

Self-transriber: Few-shot lyrics transcription with self-training

X Gao, X Yue, H Li - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
The current lyrics transcription approaches heavily rely on supervised learning with labeled
data, but such data are scarce and manual labeling of singing is expensive. How to benefit …

Speech-to-singing conversion in an encoder-decoder framework

J Parekh, P Rao, YH Yang - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
In this paper, our goal is to convert a set of spoken lines into sung ones. Unlike previous
signal-processing-based methods, we take a learning-based approach to the problem. This …

Music-robust automatic lyrics transcription of polyphonic music

X Gao, C Gupta, H Li - arXiv preprint arXiv:2204.03306, 2022 - arxiv.org
Lyrics transcription of polyphonic music is challenging because singing vocals are corrupted
by the background music. To improve the robustness of lyrics transcription to the …

A modularized neural network with language-specific output layers for cross-lingual voice conversion

Y Zhou, X Tian, E Yılmaz, RK Das… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org
This paper presents a cross-lingual voice conversion framework that adopts a modularized
neural network. The modularized neural network has a common input structure that is …

Speech-to-singing conversion based on boundary equilibrium GAN

DY Wu, YH Yang - arXiv preprint arXiv:2005.13835, 2020 - arxiv.org
This paper investigates the use of generative adversarial network (GAN)-based models for
converting the spectrogram of a speech signal into that of a singing one, without reference to …