Multi-target voice conversion without parallel data by adversarially learning disentangled audio representations

J Chou, C Yeh, H Lee, L Lee - arXiv preprint arXiv:1804.02812, 2018 - arxiv.org
Recently, cycle-consistent adversarial network (Cycle-GAN) has been successfully applied
to voice conversion to a different speaker without parallel data, although in those …

Voice conversion from unaligned corpora using variational autoencoding Wasserstein generative adversarial networks

CC Hsu, HT Hwang, YC Wu, Y Tsao… - arXiv preprint arXiv …, 2017 - arxiv.org
Building a voice conversion (VC) system from non-parallel speech corpora is challenging
but highly valuable in real application scenarios. In most situations, the source and the target …

StarGAN-VC: Non-parallel many-to-many voice conversion using star generative adversarial networks

H Kameoka, T Kaneko, K Tanaka… - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org
This paper proposes a method that allows non-parallel many-to-many voice conversion (VC)
by using a variant of a generative adversarial network (GAN) called StarGAN. Our method …

Speech representation disentanglement with adversarial mutual information learning for one-shot voice conversion

SC Yang, M Tantrawenith, H Zhuang, Z Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
One-shot voice conversion (VC) with only a single target speaker's speech for reference has
become a hot research topic. Existing works generally disentangle timbre, while information …

Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks

T Kaneko, H Kameoka, K Hiramatsu, K Kashino - Interspeech, 2017 - kecl.ntt.co.jp
We propose a training framework for sequence-to-sequence voice conversion (SVC). A
well-known problem regarding a conventional VC framework is that acoustic-feature sequences …

S2VC: A framework for any-to-any voice conversion with self-supervised pretrained representations

J Lin, YY Lin, CM Chien, H Lee - arXiv preprint arXiv:2104.02901, 2021 - arxiv.org
Any-to-any voice conversion (VC) aims to convert the timbre of utterances from and to any
speakers seen or unseen during training. Various any-to-any VC approaches have been …

CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion

T Kaneko, H Kameoka, K Tanaka… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
Non-parallel voice conversion (VC) is a technique for learning the mapping from source to
target speech without relying on parallel data. This is an important task, but it has been …

High-quality nonparallel voice conversion based on cycle-consistent adversarial network

F Fang, J Yamagishi, I Echizen… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org
Although voice conversion (VC) algorithms have achieved remarkable success along with
the development of machine learning, superior performance is still difficult to achieve when …

ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion

H Kameoka, K Tanaka, D Kwaśny… - … on audio, speech …, 2020 - ieeexplore.ieee.org
This article proposes a voice conversion (VC) method using sequence-to-sequence
(seq2seq or S2S) learning, which flexibly converts not only the voice characteristics but also …

CycleGAN-VC: Non-parallel voice conversion using cycle-consistent adversarial networks

T Kaneko, H Kameoka - 2018 26th European Signal …, 2018 - ieeexplore.ieee.org
We propose a non-parallel voice-conversion (VC) method that can learn a mapping from
source to target speech without relying on parallel data. The proposed method is particularly …