High-quality nonparallel voice conversion based on cycle-consistent adversarial network

J Gui, Z Sun, Y Wen, D Tao, J Ye - IEEE transactions on …, 2021 - ieeexplore.ieee.org

Generative adversarial networks (GANs) have recently become a hot research topic;
however, they have been studied since 2014, and a large number of algorithms have been …

Cited by 920 Related articles All 13 versions

An overview of voice conversion and its challenges: From statistical modeling to deep learning

B Sisman, J Yamagishi, S King… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org

Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …

Cited by 313 Related articles All 9 versions

Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward

M Masood, M Nawaz, KM Malik, A Javed, A Irtaza… - Applied …, 2023 - Springer

Easy access to audio-visual content on social media, combined with the availability of
modern tools such as Tensorflow or Keras, and open-source trained models, along with …

Cited by 249 Related articles All 10 versions

AutoVC: Zero-shot voice style transfer with only autoencoder loss

K Qian, Y Zhang, S Chang, X Yang… - International …, 2019 - proceedings.mlr.press

Despite the progress in voice conversion, many-to-many voice conversion trained on non-
parallel data, as well as zero-shot voice conversion, remains under-explored. Deep style …

Cited by 475 Related articles All 13 versions

CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion

T Kaneko, H Kameoka, K Tanaka… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

Non-parallel voice conversion (VC) is a technique for learning the mapping from source to
target speech without relying on parallel data. This is an important task, but it has been …

Cited by 308 Related articles All 8 versions

Emotional voice conversion: Theory, databases and ESD

K Zhou, B Sisman, R Liu, H Li - Speech Communication, 2022 - Elsevier

In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …

Cited by 119 Related articles All 6 versions

F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder

K Qian, Z Jin, M Hasegawa-Johnson… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

Non-parallel many-to-many voice conversion remains an interesting but challenging speech
processing task. Many style-transfer-inspired methods such as generative adversarial …

Cited by 109 Related articles All 6 versions

Non-parallel sequence-to-sequence voice conversion with disentangled linguistic and speaker representations

JX Zhang, ZH Ling, LR Dai - IEEE/ACM Transactions on Audio …, 2019 - ieeexplore.ieee.org

This article presents a method of sequence-to-sequence (seq2seq) voice conversion using
non-parallel training data. In this method, disentangled linguistic and speaker …

Cited by 124 Related articles All 5 versions

Silent speech interfaces for speech restoration: A review

JA Gonzalez-Lopez, A Gomez-Alanis… - IEEE …, 2020 - ieeexplore.ieee.org

This review summarises the status of silent speech interface (SSI) research. SSIs rely on
non-acoustic biosignals generated by the human body during speech production to enable …

Cited by 86 Related articles All 7 versions

Any-to-many voice conversion with location-relative sequence-to-sequence modeling

S Liu, Y Cao, D Wang, X Wu, X Liu… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org

This paper proposes an any-to-many location-relative, sequence-to-sequence (seq2seq),
non-parallel voice conversion approach, which utilizes text supervision during training. In …

Cited by 77 Related articles All 8 versions
