StarGAN-VC2: Rethinking conditional methods for StarGAN-based voice conversion

A Firc, K Malinka, P Hanáček - Heliyon, 2023 - cell.com

Deepfakes present an emerging threat in cyberspace. Recent developments in machine
learning make deepfakes highly believable, and very difficult to differentiate between what is …

Cited by 30 · All 7 versions

ContentVec: An improved self-supervised speech representation by disentangling speakers

K Qian, Y Zhang, H Gao, J Ni, CI Lai… - International …, 2022 - proceedings.mlr.press

Self-supervised learning in speech involves training a speech representation network on a
large-scale unannotated speech corpus, and then applying the learned representations to …

Cited by 121 · All 8 versions

Emotional voice conversion: Theory, databases and ESD

K Zhou, B Sisman, R Liu, H Li - Speech Communication, 2022 - Elsevier

In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …

Cited by 180 · All 7 versions

Unsupervised speech decomposition via triple information bottleneck

K Qian, Y Zhang, S Chang… - International …, 2020 - proceedings.mlr.press

Speech information can be roughly decomposed into four components: language content,
timbre, pitch, and rhythm. Obtaining disentangled representations of these components is …

Cited by 215 · All 9 versions

AGAIN-VC: A one-shot voice conversion using activation guidance and adaptive instance normalization

YH Chen, DY Wu, TH Wu, H Lee - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

Recently, voice conversion (VC) has been widely studied. Many VC systems use
disentangle-based learning techniques to separate the speaker and the linguistic content …

Cited by 125 · All 3 versions

StarGANv2-VC: A diverse, unsupervised, non-parallel framework for natural-sounding voice conversion

YA Li, A Zare, N Mesgarani - arXiv preprint arXiv:2107.10394, 2021 - arxiv.org

We present an unsupervised non-parallel many-to-many voice conversion (VC) method
using a generative adversarial network (GAN) called StarGAN v2. Using a combination of …

Cited by 109 · All 5 versions

CycleGAN-VC3: Examining and improving CycleGAN-VCs for mel-spectrogram conversion

T Kaneko, H Kameoka, K Tanaka, N Hojo - arXiv preprint arXiv …, 2020 - arxiv.org

Non-parallel voice conversion (VC) is a technique for learning mappings between source
and target speeches without using a parallel corpus. Recently, cycle-consistent adversarial …

Cited by 111 · All 8 versions

Towards relatable explainable AI with the perceptual process

W Zhang, BY Lim - Proceedings of the 2022 CHI Conference on Human …, 2022 - dl.acm.org

Machine learning models need to provide contrastive explanations, since people often seek
to understand why a puzzling prediction occurred instead of some expected outcome …

Cited by 61 · All 4 versions

AVQVC: One-shot voice conversion by vector quantization with applying contrastive learning

H Tang, X Zhang, J Wang, N Cheng… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Voice Conversion (VC) refers to changing the timbre of a speech while retaining the
discourse content. Recently, many works have focused on disentangle-based learning …

Cited by 54 · All 3 versions

MaskCycleGAN-VC: Learning non-parallel voice conversion with filling in frames

T Kaneko, H Kameoka, K Tanaka… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Non-parallel voice conversion (VC) is a technique for training voice converters without a
parallel corpus. Cycle-consistent adversarial network-based VCs (CycleGAN-VC and …

Cited by 74 · All 4 versions
