Personalized Singing Voice Generation Using WaveRNN.

T Liu, KA Lee, Q Wang, H Li - Advances in Neural …, 2023 - proceedings.neurips.cc

For speaker recognition, it is difficult to extract an accurate speaker representation from
speech because of its mixture of speaker traits and content. This paper proposes a …

Cited by 30 - Related articles - All 9 versions

[PDF] arxiv.org

Unsupervised cross-domain singing voice conversion

A Polyak, L Wolf, Y Adi, Y Taigman - arXiv preprint arXiv:2008.02830, 2020 - arxiv.org

We present a wav-to-wav generative model for the task of singing voice conversion from any
identity. Our method utilizes both an acoustic model, trained for the task of automatic speech …

Cited by 54 - Related articles - All 8 versions

[PDF] arxiv.org

Genre-conditioned acoustic models for automatic lyrics transcription of polyphonic music

X Gao, C Gupta, H Li - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org

Lyrics transcription of polyphonic music is challenging not only because the singing vocals
are corrupted by the background music, but also because the background music and the …

Cited by 24 - Related articles - All 6 versions

[PDF] ieee.org

Polyscriber: Integrated fine-tuning of extractor and lyrics transcriber for polyphonic music

X Gao, C Gupta, H Li - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org

Lyrics transcription of polyphonic music is challenging as the background music affects lyrics
intelligibility. Typically, lyrics transcription can be performed by a two-step pipeline, i.e., a …

Cited by 12 - Related articles - All 4 versions

[PDF] arxiv.org

Self-transriber: Few-shot lyrics transcription with self-training

X Gao, X Yue, H Li - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org

The current lyrics transcription approaches heavily rely on supervised learning with labeled
data, but such data are scarce and manual labeling of singing is expensive. How to benefit …

Cited by 8 - Related articles - All 3 versions

[PDF] arxiv.org

Phonetic posteriorgrams based many-to-many singing voice conversion via adversarial training

H Guo, H Lu, N Hu, C Zhang, S Yang, L Xie… - arXiv preprint arXiv …, 2020 - arxiv.org

This paper describes an end-to-end adversarial singing voice conversion (EA-SVC)
approach. It can directly generate arbitrary singing waveform by given phonetic …

Cited by 12 - Related articles - All 2 versions

[PDF] arxiv.org

Music-robust automatic lyrics transcription of polyphonic music

X Gao, C Gupta, H Li - arXiv preprint arXiv:2204.03306, 2022 - arxiv.org

Lyrics transcription of polyphonic music is challenging because singing vocals are corrupted
by the background music. To improve the robustness of lyrics transcription to the …

Cited by 8 - Related articles - All 5 versions

[PDF] arxiv.org

Singing voice synthesis with vibrato modeling and latent energy representation

Y Song, W Song, W Zhang, Z Zhang… - 2022 IEEE 24th …, 2022 - ieeexplore.ieee.org

This paper proposes an expressive singing voice synthesis system by introducing explicit
vibrato modeling and latent energy representation. Vibrato is essential to the naturalness of …

Cited by 5 - Related articles - All 3 versions

[PDF] arxiv.org

Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs

R Tao, KA Lee, RK Das, V Hautamäki, H Li - arXiv preprint arXiv …, 2022 - arxiv.org

We study a novel neural architecture and its training strategies of speaker encoder for
speaker recognition without using any identity labels. The speaker encoder is trained to …

Cited by 2 - Related articles - All 2 versions

Automatic lyrics transcription of polyphonic music

X Gao - 2022 - search.proquest.com

Abstract Automatic Lyrics Transcription of polyphonic music (ALTP) aims to recognize the
sung lyrics from singing vocals in the presence of instrumental music accompaniment, and it …

Cited by 1 - Related articles - All 2 versions
