Any-to-many voice conversion with location-relative sequence-to-sequence modeling

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 86 · Related articles · All 6 versions

[PDF] mdpi.com

Overview of voice conversion methods based on deep learning

T Walczyna, Z Piotrowski - Applied Sciences, 2023 - mdpi.com

Voice conversion is a process where the essence of a speaker's identity is seamlessly
transferred to another speaker, all while preserving the content of their speech. This usage is …

Cited by 17 · Related articles · All 3 versions

[PDF] arxiv.org

Diffusion-based voice conversion with fast maximum likelihood sampling scheme

V Popov, I Vovk, V Gogoryan, T Sadekova… - arXiv preprint arXiv …, 2021 - arxiv.org

Voice conversion is a common speech synthesis task which can be solved in different ways
depending on a particular real-world scenario. The most challenging one often referred to as …

Cited by 73 · Related articles · All 4 versions

[PDF] arxiv.org

FreeVC: Towards high-quality text-free one-shot voice conversion

J Li, W Tu, L Xiao - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org

Voice conversion (VC) can be achieved by first extracting source content information and
target speaker information, and then reconstructing waveform with these information …

Cited by 42 · Related articles · All 2 versions

[PDF] ieee.org

Emotion intensity and its control for emotional voice conversion

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while
preserving the linguistic content and speaker identity. In EVC, emotions are usually treated …

Cited by 43 · Related articles · All 7 versions

[PDF] arxiv.org

UniAudio: An audio foundation model toward universal audio generation

D Yang, J Tian, X Tan, R Huang, S Liu, X Chang… - arXiv preprint arXiv …, 2023 - arxiv.org

Language models (LMs) have demonstrated the capability to handle a variety of generative
tasks. This paper presents the UniAudio system, which, unlike prior task-specific …

Cited by 38 · Related articles · All 3 versions

[PDF] arxiv.org

Make-A-Voice: Unified voice synthesis with discrete representation

R Huang, C Zhang, Y Wang, D Yang, L Liu… - arXiv preprint arXiv …, 2023 - arxiv.org

Various applications of voice synthesis have been developed independently despite the fact
that they generate "voice" as output in common. In addition, the majority of voice synthesis …

Cited by 20 · Related articles · All 2 versions

[PDF] arxiv.org

DRVC: A framework of any-to-any voice conversion with self-supervised learning

Q Wang, X Zhang, J Wang, N Cheng… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Any-to-any voice conversion problem aims to convert voices for source and target speakers,
which are out of the training data. Previous works widely utilize the disentangle-based …

Cited by 25 · Related articles · All 5 versions

[PDF] arxiv.org

Disentangling content and fine-grained prosody information via hybrid ASR bottleneck features for voice conversion

X Zhao, F Liu, C Song, Z Wu, S Kang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Non-parallel data voice conversion (VC) have achieved considerable breakthroughs
recently through introducing bottleneck features (BNFs) extracted by the automatic speech …

Cited by 21 · Related articles · All 4 versions

[PDF] arxiv.org

Voice conversion with just nearest neighbors

M Baas, B van Niekerk, H Kamper - arXiv preprint arXiv:2305.18975, 2023 - arxiv.org

Any-to-any voice conversion aims to transform source speech into a target voice with just a
few examples of the target speaker as a reference. Recent methods produce convincing …

Cited by 13 · Related articles · All 6 versions
