Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial...

B Sisman, J Yamagishi, S King… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org

Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …

Cited by 417 - Related articles - All 8 versions

Generative adversarial networks for speech processing: A review

A Wali, Z Alamgir, S Karim, A Fawaz, MB Ali… - Computer Speech & …, 2022 - Elsevier

Generative adversarial networks (GANs) have seen remarkable progress in recent years.
They are used as generative models for all kinds of data such as text, images, audio, music …

Cited by 63 - Related articles - All 2 versions

[PDF] arxiv.org

StarGAN-VC: Non-parallel many-to-many voice conversion using star generative adversarial networks

H Kameoka, T Kaneko, K Tanaka… - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org

This paper proposes a method that allows non-parallel many-to-many voice conversion (VC)
by using a variant of a generative adversarial network (GAN) called StarGAN. Our method …

Cited by 504 - Related articles - All 5 versions

[PDF] arxiv.org

CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion

T Kaneko, H Kameoka, K Tanaka… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

Non-parallel voice conversion (VC) is a technique for learning the mapping from source to
target speech without relying on parallel data. This is an important task, but it has been …

Cited by 350 - Related articles - All 7 versions

[PDF] ntt.co.jp

CycleGAN-VC: Non-parallel voice conversion using cycle-consistent adversarial networks

T Kaneko, H Kameoka - 2018 26th European Signal …, 2018 - ieeexplore.ieee.org

We propose a non-parallel voice-conversion (VC) method that can learn a mapping from
source to target speech without relying on parallel data. The proposed method is particularly …

Cited by 371 - Related articles - All 8 versions

[PDF] arxiv.org

Parallel-data-free voice conversion using cycle-consistent adversarial networks

T Kaneko, H Kameoka - arXiv preprint arXiv:1711.11293, 2017 - arxiv.org

We propose a parallel-data-free voice-conversion (VC) method that can learn a mapping
from source to target speech without relying on parallel data. The proposed method is …

Cited by 277 - Related articles - All 3 versions

[PDF] semanticscholar.org

Time-frequency masking-based speech enhancement using generative adversarial network

MH Soni, N Shah, HA Patil - 2018 IEEE international …, 2018 - ieeexplore.ieee.org

The success of time-frequency (TF) mask-based approaches is dependent on the accuracy
of predicted mask given the noisy spectral features. The state-of-the-art methods in TF …

Cited by 258 - Related articles - All 6 versions

[PDF] arxiv.org

StarGAN-VC2: Rethinking conditional methods for StarGAN-based voice conversion

T Kaneko, H Kameoka, K Tanaka, N Hojo - arXiv preprint arXiv …, 2019 - arxiv.org

Non-parallel multi-domain voice conversion (VC) is a technique for learning mappings
among multiple domains without relying on parallel data. This is important but challenging …

Cited by 183 - Related articles - All 7 versions

[PDF] arxiv.org

ACVAE-VC: Non-parallel voice conversion with auxiliary classifier variational autoencoder

H Kameoka, T Kaneko, K Tanaka… - IEEE/ACM Transactions …, 2019 - ieeexplore.ieee.org

This paper proposes a non-parallel voice conversion (VC) method using a variant of the
conditional variational autoencoder (VAE) called an auxiliary classifier VAE. The proposed …

Cited by 186 - Related articles - All 5 versions

[PDF] arxiv.org

Sequence-to-sequence acoustic modeling for voice conversion

JX Zhang, ZH Ling, LJ Liu, Y Jiang… - IEEE/ACM Transactions …, 2019 - ieeexplore.ieee.org

In this paper, a neural network named sequence-to-sequence ConvErsion NeTwork
(SCENT) is presented for acoustic modeling in voice conversion. At the training stage, a SCENT …

Cited by 170 - Related articles - All 5 versions
