CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 86 · Related articles · All 6 versions

[PDF] ieee.org

An overview of voice conversion and its challenges: From statistical modeling to deep learning

B Sisman, J Yamagishi, S King… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org

Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …

Cited by 318 · Related articles · All 9 versions

[PDF] arxiv.org

Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward

M Masood, M Nawaz, KM Malik, A Javed, A Irtaza… - Applied …, 2023 - Springer

Easy access to audio-visual content on social media, combined with the availability of
modern tools such as Tensorflow or Keras, and open-source trained models, along with …

Cited by 253 · Related articles · All 10 versions

[PDF] mlr.press

ContentVec: An improved self-supervised speech representation by disentangling speakers

K Qian, Y Zhang, H Gao, J Ni, CI Lai… - International …, 2022 - proceedings.mlr.press

Self-supervised learning in speech involves training a speech representation network on a
large-scale unannotated speech corpus, and then applying the learned representations to …

Cited by 70 · Related articles · All 9 versions

[PDF] arxiv.org

StarGAN-VC2: Rethinking conditional methods for StarGAN-based voice conversion

T Kaneko, H Kameoka, K Tanaka, N Hojo - arXiv preprint arXiv …, 2019 - arxiv.org

Non-parallel multi-domain voice conversion (VC) is a technique for learning mappings
among multiple domains without relying on parallel data. This is important but challenging …

Cited by 160 · Related articles · All 8 versions

[PDF] arxiv.org

AGAIN-VC: A one-shot voice conversion using activation guidance and adaptive instance normalization

YH Chen, DY Wu, TH Wu, H Lee - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

Recently, voice conversion (VC) has been widely studied. Many VC systems use
disentangle-based learning techniques to separate the speaker and the linguistic content …

Cited by 95 · Related articles · All 4 versions

[PDF] arxiv.org

CycleGAN-VC3: Examining and improving CycleGAN-VCs for mel-spectrogram conversion

T Kaneko, H Kameoka, K Tanaka, N Hojo - arXiv preprint arXiv …, 2020 - arxiv.org

Non-parallel voice conversion (VC) is a technique for learning mappings between source
and target speeches without using a parallel corpus. Recently, cycle-consistent adversarial …

Cited by 90 · Related articles · All 9 versions

[PDF] arxiv.org

Non-parallel sequence-to-sequence voice conversion with disentangled linguistic and speaker representations

JX Zhang, ZH Ling, LR Dai - IEEE/ACM Transactions on Audio …, 2019 - ieeexplore.ieee.org

This article presents a method of sequence-to-sequence (seq2seq) voice conversion using
non-parallel training data. In this method, disentangled linguistic and speaker …

Cited by 124 · Related articles · All 5 versions

[HTML] mdpi.com

[HTML] A review of synthetic image data and its use in computer vision

K Man, J Chahl - Journal of Imaging, 2022 - mdpi.com

Development of computer vision algorithms using convolutional neural networks and deep
learning has necessitated ever greater amounts of annotated and labelled data to produce …

Cited by 37 · Related articles · All 6 versions

[PDF] neurips.cc

VoiceMixer: Adversarial voice style mixup

SH Lee, JH Kim, H Chung… - Advances in Neural …, 2021 - proceedings.neurips.cc

Although recent advances in voice conversion have shown significant improvement, there
still remains a gap between the converted voice and target voice. A key factor that maintains …

Cited by 31 · Related articles · All 7 versions
