FreeVC: Towards high-quality text-free one-shot voice conversion

JP Cardenuto, J Yang, R Padilha… - … on Signal and …, 2023 - nowpublishers.com

Synthetic realities are digital creations or augmentations that are contextually generated
through the use of Artificial Intelligence (AI) methods, leveraging extensive amounts of data …

Cited by 23 · Related articles · All 7 versions

Overview of voice conversion methods based on deep learning

T Walczyna, Z Piotrowski - Applied sciences, 2023 - mdpi.com

Voice conversion is a process where the essence of a speaker's identity is seamlessly
transferred to another speaker, all while preserving the content of their speech. This usage is …

Cited by 28 · Related articles · All 5 versions

Deepfakes as a threat to a speaker and facial recognition: An overview of tools and attack vectors

A Firc, K Malinka, P Hanáček - Heliyon, 2023 - cell.com

Deepfakes present an emerging threat in cyberspace. Recent developments in machine
learning make deepfakes highly believable, and very difficult to differentiate between what is …

Cited by 21 · Related articles · All 7 versions

Voice conversion with just nearest neighbors

M Baas, B van Niekerk, H Kamper - arXiv preprint arXiv:2305.18975, 2023 - arxiv.org

Any-to-any voice conversion aims to transform source speech into a target voice with just a
few examples of the target speaker as a reference. Recent methods produce convincing …

Cited by 39 · Related articles · All 6 versions

MLAAD: The multi-language audio anti-spoofing dataset

NM Müller, P Kawa, WH Choong, E Casanova… - arXiv preprint arXiv …, 2024 - arxiv.org

Text-to-Speech (TTS) technology brings significant advantages, such as giving a voice to
those with speech impairments, but also enables audio deepfakes and spoofs. The former …

Cited by 20 · Related articles · All 2 versions

RefXVC: Cross-lingual voice conversion with enhanced reference leveraging

M Zhang, Y Zhou, Y Ren, C Zhang… - … /ACM Transactions on …, 2024 - ieeexplore.ieee.org

This paper proposes RefXVC, a method for crosslingual voice conversion (XVC) that
leverages reference information to improve conversion performance. Previous XVC works …

Cited by 2 · Related articles · All 4 versions

Fake the real: Backdoor attack on deep speech classification via voice conversion

Z Ye, T Mao, L Dong, D Yan - arXiv preprint arXiv:2306.15875, 2023 - arxiv.org

Deep speech classification has achieved tremendous success and greatly promoted the
emergence of many real-world applications. However, backdoor attacks present a new …

Cited by 11 · Related articles · All 4 versions

OpenVoice: Versatile instant voice cloning

Z Qin, W Zhao, X Yu, X Sun - arXiv preprint arXiv:2312.01479, 2023 - arxiv.org

We introduce OpenVoice, a versatile voice cloning approach that requires only a short audio
clip from the reference speaker to replicate their voice and generate speech in multiple …

Cited by 18 · Related articles · All 2 versions

StreamVC: Real-Time Low-Latency Voice Conversion

Y Yang, Y Kartynnik, Y Li, J Tang, X Li… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

We present StreamVC, a streaming voice conversion solution that preserves the content and
prosody of any source speech while matching the voice timbre from any target speech …

Cited by 8 · Related articles · All 6 versions

Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval

Y Deng, H Tang, X Zhang, N Cheng… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Voice conversion refers to transferring speaker identity with well-preserved content. Better
disentanglement of speech representations leads to better voice conversion. Recent studies …

Cited by 3 · Related articles · All 4 versions
