F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder

K Qian, Z Jin, M Hasegawa-Johnson… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
Non-parallel many-to-many voice conversion remains an interesting but challenging speech
processing task. Many style-transfer-inspired methods such as generative adversarial …

Speech representation disentanglement with adversarial mutual information learning for one-shot voice conversion

SC Yang, M Tantrawenith, H Zhuang, Z Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
One-shot voice conversion (VC) with only a single target speaker's speech for reference has
become a hot research topic. Existing works generally disentangle timbre, while information …

Many-to-many voice conversion using conditional cycle-consistent adversarial networks

S Lee, BG Ko, K Lee, IC Yoo… - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Voice conversion (VC) refers to transforming the speaker characteristics of an utterance
without altering its linguistic contents. Many works on voice conversion require to have …

StarGAN-VC: Non-parallel many-to-many voice conversion using star generative adversarial networks

H Kameoka, T Kaneko, K Tanaka… - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org
This paper proposes a method that allows non-parallel many-to-many voice conversion (VC)
by using a variant of a generative adversarial network (GAN) called StarGAN. Our method …

FragmentVC: Any-to-any voice conversion by end-to-end extracting and fusing fine-grained voice fragments with attention

YY Lin, CM Chien, JH Lin, H Lee… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Any-to-any voice conversion aims to convert the voice from and to any speakers even
unseen during training, which is much more challenging compared to one-to-one or many-to …

NVC-Net: End-to-end adversarial voice conversion

B Nguyen, F Cardinaux - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Voice conversion (VC) has gained increasing popularity in many speech synthesis
applications. The idea is to change the voice identity from one speaker into another while …

Any-to-one sequence-to-sequence voice conversion using self-supervised discrete speech representations

WC Huang, YC Wu, T Hayashi - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
We present a novel approach to any-to-one (A2O) voice conversion (VC) in a sequence-to-
sequence (seq2seq) framework. A2O VC aims to convert any speaker, including those …

VQMIVC: Vector quantization and mutual information-based unsupervised speech representation disentanglement for one-shot voice conversion

D Wang, L Deng, YT Yeung, X Chen, X Liu… - arXiv preprint arXiv …, 2021 - arxiv.org
One-shot voice conversion (VC), which performs conversion across arbitrary speakers with
only a single target-speaker utterance for reference, can be effectively achieved by speech …

StarGANv2-VC: A diverse, unsupervised, non-parallel framework for natural-sounding voice conversion

YA Li, A Zare, N Mesgarani - arXiv preprint arXiv:2107.10394, 2021 - arxiv.org
We present an unsupervised non-parallel many-to-many voice conversion (VC) method
using a generative adversarial network (GAN) called StarGAN v2. Using a combination of …

ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion

H Kameoka, K Tanaka, D Kwaśny… - … on audio, speech …, 2020 - ieeexplore.ieee.org
This article proposes a voice conversion (VC) method using sequence-to-sequence
(seq2seq or S2S) learning, which flexibly converts not only the voice characteristics but also …