High-quality nonparallel voice conversion based on cycle-consistent adversarial network

F Fang, J Yamagishi, I Echizen… - … on Acoustics, Speech …, 2018 - ieeexplore.ieee.org
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 - ieeexplore.ieee.org
Although voice conversion (VC) algorithms have achieved remarkable success along with the development of machine learning, superior performance is still difficult to achieve when using nonparallel data. In this paper, we propose using a cycle-consistent adversarial network (CycleGAN) for nonparallel data-based VC training. A CycleGAN is a generative adversarial network (GAN) originally developed for unpaired image-to-image translation. A subjective evaluation of inter-gender conversion demonstrated that the proposed method significantly outperformed a method based on the Merlin open-source neural network speech synthesis system (a parallel VC system adapted for our setup) and a GAN-based parallel VC system. This is the first research to show that the performance of a nonparallel VC method can exceed that of state-of-the-art parallel VC methods.
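
The cycle-consistency idea that lets a CycleGAN train on unpaired (nonparallel) data can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not give the loss formulation, features, or network architectures. The sketch assumes the L1 cycle loss from the original CycleGAN formulation, and the names `l1_distance`, `cycle_consistency_loss`, `to_target`, and `to_source` are hypothetical, with simple scalings standing in for the neural-network generators.

```python
# Toy illustration of a cycle-consistency loss.
# Assumption: L1 distance, as in the original CycleGAN formulation; the
# paper's exact losses and networks are not stated in the abstract.

def l1_distance(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(g_xy, g_yx, source_batch, target_batch):
    """Average L1 error after mapping X -> Y -> X plus Y -> X -> Y."""
    forward = [l1_distance(x, g_yx(g_xy(x))) for x in source_batch]
    backward = [l1_distance(y, g_xy(g_yx(y))) for y in target_batch]
    return sum(forward) / len(forward) + sum(backward) / len(backward)


# Hypothetical stand-in "generators": scalings instead of neural networks.
def to_target(features):
    return [2.0 * f for f in features]


def to_source(features):
    return [0.5 * f for f in features]


def identity(features):
    return list(features)


# Unpaired frames: the two batches need not match in content or length.
source_frames = [[1.0, 2.0], [3.0, 4.0]]  # speaker X
target_frames = [[10.0, 20.0]]            # speaker Y

# The mappings invert each other, so every round trip is exact.
consistent_loss = cycle_consistency_loss(
    to_target, to_source, source_frames, target_frames
)

# Replacing the backward mapping with the identity breaks the round trip.
inconsistent_loss = cycle_consistency_loss(
    to_target, identity, source_frames, target_frames
)
```

In a full system this term would be added to the adversarial losses of the two discriminators. The point of the sketch is only that the loss needs no aligned source and target utterances, since each sample is compared with its own round-trip reconstruction.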