Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion

X Zhang, L Xue, Y Gu, Y Wang, J Li, H He… - arXiv preprint arXiv …, 2023 - arxiv.org

Amphion is an open-source toolkit for Audio, Music, and Speech Generation, targeting to
ease the way for junior researchers and engineers into these fields. It presents a unified …

Cited by 21 · Related articles · All 2 versions

[PDF] arxiv.org

CoMoSVC: Consistency Model-based Singing Voice Conversion

Y Lu, Z Ye, W Xue, X Tan, Q Liu, Y Guo - arXiv preprint arXiv:2401.01792, 2024 - arxiv.org

The diffusion-based Singing Voice Conversion (SVC) methods have achieved remarkable
performances, producing natural audios with high similarity to the target timbre. However …

Cited by 8 · Related articles · All 2 versions

[PDF] arxiv.org

SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion

L Xue, C Wang, M Wang, X Zhang, J Han… - arXiv preprint arXiv …, 2024 - arxiv.org

In this study, we present SingVisio, an interactive visual analysis system that aims to explain
the diffusion model used in singing voice conversion. SingVisio provides a visual display of …

Cited by 3 · Related articles · All 2 versions

[PDF] arxiv.org

HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts

X Niu, J Zhang, CP Martin - arXiv preprint arXiv:2404.15637, 2024 - arxiv.org

We introduce HybridVC, a voice conversion (VC) framework built upon a pre-trained
conditional variational autoencoder (CVAE) that combines the strengths of a latent model …

Cited by 1 · Related articles · All 3 versions

[PDF] arxiv.org
