Related articles - Academic Resource Search

AutoVC: Zero-shot voice style transfer with only autoencoder loss

K Qian, Y Zhang, S Chang, X Yang… - International …, 2019 - proceedings.mlr.press

Despite the progress in voice conversion, many-to-many voice conversion trained on non-
parallel data, as well as zero-shot voice conversion, remains under-explored. Deep style …

Cited by 477 - Related articles - All 13 versions

[PDF] arxiv.org

Improving zero-shot voice style transfer via disentangled representation learning

S Yuan, P Cheng, R Zhang, W Hao, Z Gan… - arXiv preprint arXiv …, 2021 - arxiv.org

Voice style transfer, also called voice conversion, seeks to modify one speaker's voice to
generate speech as if it came from another (target) speaker. Previous works have made …

Cited by 66 - Related articles - All 5 versions

[PDF] arxiv.org

F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder

K Qian, Z Jin, M Hasegawa-Johnson… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

Non-parallel many-to-many voice conversion remains an interesting but challenging speech
processing task. Many style-transfer-inspired methods such as generative adversarial …

Cited by 109 - Related articles - All 6 versions

[PDF] neurips.cc

VoiceMixer: Adversarial voice style mixup

SH Lee, JH Kim, H Chung… - Advances in Neural …, 2021 - proceedings.neurips.cc

Although recent advances in voice conversion have shown significant improvement, there
still remains a gap between the converted voice and target voice. A key factor that maintains …

Cited by 31 - Related articles - All 7 versions

[PDF] arxiv.org

StarGANv2-VC: A diverse, unsupervised, non-parallel framework for natural-sounding voice conversion

YA Li, A Zare, N Mesgarani - arXiv preprint arXiv:2107.10394, 2021 - arxiv.org

We present an unsupervised non-parallel many-to-many voice conversion (VC) method
using a generative adversarial network (GAN) called StarGAN v2. Using a combination of …

Cited by 76 - Related articles - All 5 versions

[PDF] arxiv.org

MelGAN-VC: Voice conversion and audio style transfer on arbitrarily long samples using spectrograms

M Pasini - arXiv preprint arXiv:1910.03713, 2019 - arxiv.org

Traditional voice conversion methods rely on parallel recordings of multiple speakers
pronouncing the same sentences. For real-world applications however, parallel data is …

Cited by 47 - Related articles - All 2 versions

[PDF] arxiv.org

Parallel-data-free voice conversion using cycle-consistent adversarial networks

T Kaneko, H Kameoka - arXiv preprint arXiv:1711.11293, 2017 - arxiv.org

We propose a parallel-data-free voice-conversion (VC) method that can learn a mapping
from source to target speech without relying on parallel data. The proposed method is …

Cited by 276 - Related articles - All 3 versions

[PDF] arxiv.org

NVC-Net: End-to-end adversarial voice conversion

B Nguyen, F Cardinaux - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

Voice conversion (VC) has gained increasing popularity in many speech synthesis
applications. The idea is to change the voice identity from one speaker into another while …

Cited by 37 - Related articles - All 3 versions

[PDF] apsipa.org

SINGAN: Singing voice conversion with generative adversarial networks

B Sisman, K Vijayan, M Dong… - 2019 Asia-Pacific Signal …, 2019 - ieeexplore.ieee.org

Singing voice conversion (SVC) is a task to convert the source singer's voice to sound like
that of the target singer, without changing the lyrical content. So far, most of the voice …

Cited by 45 - Related articles - All 5 versions

[PDF] ieee.org

Transfer learning from speech synthesis to voice conversion with non-parallel training data

M Zhang, Y Zhou, L Zhao, H Li - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org

We present a novel voice conversion (VC) framework by learning from a text-to-speech
(TTS) synthesis system, that is called TTS-VC transfer learning or TTL-VC for short. We first …

Cited by 54 - Related articles - All 6 versions
