Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's...

N Hajarolasvadi, MA Ramirez, W Beccaro… - IEEE …, 2020 - ieeexplore.ieee.org

Deep generative models have become an emerging topic in various research areas like
computer vision and signal processing. These models allow synthesizing realistic data …

Cited by 25 · Related articles · All 15 versions

Deep learning serves voice cloning: how vulnerable are automatic speaker verification systems to spoofing trials?

P Partila, J Tovarek, GH Ilk, J Rozhon… - IEEE Communications …, 2020 - ieeexplore.ieee.org

This article verifies the reliability of automatic speaker verification (ASV) systems on new
synthesis methods based on deep neural networks. ASV systems are widely used and …

Cited by 35 · Related articles · All 3 versions

[PDF] kth.se

Casting to corpus: Segmenting and selecting spontaneous dialogue for TTS with a CNN-LSTM speaker-dependent breath detector

É Székely, GE Henter… - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org

This paper considers utilising breaths to create improved spontaneous-speech corpora for
conversational text-to-speech from found audio recordings such as dialogue podcasts …

Cited by 40 · Related articles · All 5 versions

[PDF] isca-archive.org

[PDF] Speaker recognition-assisted robust audio deepfake detection.

J Pan, S Nie, H Zhang, S He, K Zhang, S Liang… - Interspeech, 2022 - isca-archive.org

Audio deepfake detection is usually formulated as a binary classification between genuine
and fake speech for an entire utterance. Environmental clues such as background and …

Cited by 9 · Related articles · All 6 versions

[PDF] arxiv.org

VAW-GAN for singing voice conversion with non-parallel training data

J Lu, K Zhou, B Sisman, H Li - 2020 Asia-Pacific Signal and …, 2020 - ieeexplore.ieee.org

Singing voice conversion aims to convert singer's voice from source to target without
changing singing content. Parallel training data is typically required for the training of …

Cited by 21 · Related articles · All 4 versions

Applications of deep learning to audio generation

Y Zhao, X Xia, R Togneri - IEEE Circuits and Systems …, 2019 - ieeexplore.ieee.org

In the recent past years, deep learning based machine learning systems have demonstrated
remarkable success for a wide range of learning tasks in multiple domains such as computer …

Cited by 22 · Related articles · All 2 versions

[PDF] arxiv.org

Noise tokens: Learning neural noise templates for environment-aware speech enhancement

H Li, J Yamagishi - arXiv preprint arXiv:2004.04001, 2020 - arxiv.org

In recent years, speech enhancement (SE) has achieved impressive progress with the
success of deep neural networks (DNNs). However, the DNN approach usually fails to …

Cited by 19 · Related articles · All 8 versions

[PDF] isca-archive.org

[PDF] High-quality Voice Conversion Using Spectrogram-Based WaveNet Vocoder.

K Chen, B Chen, J Lai, K Yu - Interspeech, 2018 - isca-archive.org

Waveform generator is a key component in voice conversion. Recently, WaveNet waveform
generator conditioned on the Mel-cepstrum (Mcep) has shown better quality over standard …

Cited by 28 · Related articles · All 3 versions

[HTML] nature.com

[HTML] Reconstruction of Iberian ceramic potteries using generative adversarial networks

P Navarro, C Cintas, M Lucena, JM Fuertes… - Scientific reports, 2022 - nature.com

Several aspects of past culture, including historical trends, are inferred from time-based
patterns observed in archaeological artifacts belonging to different periods. The presence …

Cited by 5 · Related articles · All 11 versions

[HTML] mdpi.com

[HTML] Manipulating voice attributes by adversarial learning of structured disentangled representations

L Benaroya, N Obin, A Roebel - Entropy, 2023 - mdpi.com

Voice conversion (VC) consists of digitally altering the voice of an individual to manipulate
part of its content, primarily its identity, while maintaining the rest unchanged. Research in …

Cited by 4 · Related articles · All 13 versions
