H Du, X Tian, L Xie, H Li - 2021 IEEE Spoken language …, 2021 - ieeexplore.ieee.org
We propose a novel training scheme to optimize a voice conversion network with a speaker identity loss function. The training scheme not only minimizes frame-level spectral loss, but …
R Tao, KA Lee, RK Das… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
We study a novel neural speaker encoder and its training strategies for speaker recognition without using any identity labels. The speaker encoder is trained to extract a fixed …
This paper presents a cross-lingual voice conversion framework that adopts a modularized neural network. The modularized neural network has a common input structure that is …
Cross-Lingual Voice Conversion (XVC) aims to modify a source speaker identity towards a target while preserving the source linguistic content. This paper introduces a cycle …
H Du, L Xie - arXiv preprint arXiv:2106.10406, 2021 - arxiv.org
One-shot voice conversion has received significant attention since only one utterance from source speaker and target speaker respectively is required. Moreover, source speaker and …
We study a novel neural architecture and its training strategies of speaker encoder for speaker recognition without using any identity labels. The speaker encoder is trained to …
CJ Chang - arXiv preprint arXiv:2009.14668, 2020 - arxiv.org
Cross-lingual voice conversion (VC) is a task that aims to synthesize target voices with the same content while source and target speakers speak in different languages. Its challenge …
S Yan, S Chen, Y Xu, D Ke - International Conference on Artificial …, 2023 - Springer
Voice conversion aims to change the timbre of the source speaker to that of the target speaker without changing the speech content. The cross-lingual voice conversion requires …
In our daily life, humans can recognize the person based on their facial and voice characteristics. Research in biology has proved that speech and face modalities can provide …