Generative adversarial networks (GANs) have seen remarkable progress in recent years. They are used as generative models for all kinds of data such as text, images, audio, music …
This paper proposes a method that allows non-parallel many-to-many voice conversion (VC) by using a variant of a generative adversarial network (GAN) called StarGAN. Our method …
Non-parallel voice conversion (VC) is a technique for learning the mapping from source to target speech without relying on parallel data. This is an important task, but it has been …
T Kaneko, H Kameoka - 2018 26th European Signal …, 2018 - ieeexplore.ieee.org
We propose a non-parallel voice-conversion (VC) method that can learn a mapping from source to target speech without relying on parallel data. The proposed method is particularly …
T Kaneko, H Kameoka - arXiv preprint arXiv:1711.11293, 2017 - arxiv.org
We propose a parallel-data-free voice-conversion (VC) method that can learn a mapping from source to target speech without relying on parallel data. The proposed method is …
The success of time-frequency (TF) mask-based approaches depends on the accuracy of the predicted mask given the noisy spectral features. The state-of-the-art methods in TF …
Non-parallel multi-domain voice conversion (VC) is a technique for learning mappings among multiple domains without relying on parallel data. This is important but challenging …
Accurate estimation of Room Impulse Response (RIR), which captures an environment's acoustic properties, is important for speech processing and AR/VR …
We propose a training framework for sequence-to-sequence voice conversion (SVC). A well-known problem regarding a conventional VC framework is that acoustic-feature sequences …