Language agnostic speaker embedding for cross-lingual personalized speech generation

Y Zhou, X Tian, H Li - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
Cross-lingual personalized speech generation seeks to synthesize a target speaker's voice
from only a few training samples that are in a different language. One popular technique is to …

MulliVC: Multi-lingual Voice Conversion With Cycle Consistency

J Huang, C Zhang, Y Ren, Z Jiang, Z Ye, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Voice conversion aims to modify the source speaker's voice to resemble the target speaker
while preserving the original speech content. Despite notable advancements in voice …

GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion

M Proszewska, G Beringer, D Sáez-Trigueros… - arXiv preprint arXiv …, 2022 - arxiv.org
In this paper, we propose GlowVC: a multilingual multi-speaker flow-based model for
language-independent text-free voice conversion. We build on Glow-TTS, which provides an …

Optimization of cross-lingual voice conversion with linguistics losses to reduce foreign accents

Y Zhou, Z Wu, X Tian, H Li - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
Cross-lingual voice conversion (XVC) transforms the speaker identity of a source speaker to
that of a target speaker who speaks a different language. Due to the intrinsic differences …

Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion

H Guo, C Liu, CT Ishi, H Ishiguro - 2023 IEEE Automatic Speech …, 2023 - ieeexplore.ieee.org
Voice conversion systems have made significant advancements in terms of naturalness and
similarity in common voice conversion tasks. However, their performance in more complex …

Improvement Speaker Similarity for Zero-Shot Any-to-Any Voice Conversion of Whispered and Regular Speech

A Avdeeva, A Gusev - arXiv preprint arXiv:2408.11528, 2024 - arxiv.org
Zero-shot voice conversion aims to transfer the voice of a source speaker to that of a
speaker unseen during training, while preserving the content information. Although various …

Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation

N Ellinas, G Vamvoukakis, K Markopoulos… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper presents a method for end-to-end cross-lingual text-to-speech (TTS) which aims
to preserve the target language's pronunciation regardless of the original speaker's …

Cross-Lingual Voice Conversion

Y Zhou - 2022 - search.proquest.com
Cross-Lingual Voice Conversion (XVC) aims to change the identity of a source
speaker towards a target speaker while preserving the content. In particular, the source and …