Z Khanjani, G Watson, VP Janeja - Frontiers in Big Data, 2023 - frontiersin.org
A deepfake is content or material that is synthetically generated or manipulated using artificial intelligence (AI) methods, to be passed off as real and can include audio, video …
In this paper, we present a novel technique for non-parallel voice conversion (VC) using cyclic variational autoencoder (CycleVAE)-based spectral modeling. In a variational …
M Zhang, Y Zhou, L Zhao, H Li - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org
We present a novel voice conversion (VC) framework by learning from a text-to-speech (TTS) synthesis system, that is called TTS-VC transfer learning or TTL-VC for short. We first …
The Science of Deep Learning emerged from courses taught by the author that have provided thousands of students with training and experience for their academic studies, and …
Over the past few years, significant progress has been made in the field of presentation attack detection (PAD) for automatic speaker verification (ASV). This includes the …
We describe our submitted system for the ZeroSpeech Challenge 2019. The current challenge theme addresses the difficulty of constructing a speech synthesizer without any …
SC Yang, M Tantrawenith, H Zhuang, Z Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
One-shot voice conversion (VC) with only a single target speaker's speech for reference has become a hot research topic. Existing works generally disentangle timbre, while information …
We propose a nonparallel data-driven emotional speech conversion method. It enables the transfer of emotion-related characteristics of a speech signal while preserving the speaker's …