An overview of voice conversion systems

SH Mohammadi, A Kain - Speech Communication, 2017 - Elsevier
Voice transformation (VT) aims to change one or more aspects of a speech signal while
preserving linguistic information. A subset of VT, Voice conversion (VC) specifically aims to …

Voice conversion using deep neural networks with layer-wise generative training

LH Chen, ZH Ling, LJ Liu, LR Dai - IEEE/ACM Transactions on …, 2014 - ieeexplore.ieee.org
This paper presents a new spectral envelope conversion method using deep neural
networks (DNNs). The conventional joint density Gaussian mixture model (JDGMM) based …

Voice conversion using deep neural networks with speaker-independent pre-training

SH Mohammadi, A Kain - 2014 IEEE Spoken Language …, 2014 - ieeexplore.ieee.org
In this study, we trained a deep autoencoder to build compact representations of short-term
spectra of multiple speakers. Using this compact representation as mapping features, we …

A deep generative architecture for postfiltering in statistical parametric speech synthesis

LH Chen, T Raitio, C Valentini-Botinhao… - … on Audio, Speech …, 2015 - ieeexplore.ieee.org
The generated speech of hidden Markov model (HMM)-based statistical parametric speech
synthesis still sounds “muffled.” One cause of this degradation in speech quality may be the …

Applications of deep learning to audio generation

Y Zhao, X Xia, R Togneri - IEEE Circuits and Systems …, 2019 - ieeexplore.ieee.org
In the recent past years, deep learning based machine learning systems have demonstrated
remarkable success for a wide range of learning tasks in multiple domains such as computer …

DNN-based stochastic postfilter for HMM-based speech synthesis

LH Chen, T Raitio, C Valentini-Botinhao… - Interspeech …, 2014 - research.ed.ac.uk
In this paper we propose a deep neural network to model the conditional probability of the
spectral differences between natural and synthetic speech. This allows us to reconstruct the …

A Comprehensive Review on Speech Synthesis Using Neural-Network Based Approaches

NN Perera, GU Ganegoda - 2023 3rd International Conference …, 2023 - ieeexplore.ieee.org
Speech is a primary mode of communication between human beings. With that, the need of
creating artificial speech became a dream of humankind from the beginning of the 1980s. As …

The HCCL-CUHK System for the Voice Conversion Challenge 2018.

S Liu, L Sun, X Wu, X Liu, H Meng - Odyssey, 2018 - researchgate.net
This paper presents the HCCL-CUHK system for the Voice Conversion Challenge 2018 (the
VCC 2018), which is mainly characterized by doing Voice Conversion (VC) with nonparallel …

Semi-supervised training of a voice conversion mapping function using a joint-autoencoder.

SH Mohammadi, A Kain - INTERSPEECH, 2015 - isca-archive.org
Recently, researchers have begun to investigate Deep Neural Network (DNN) architectures
as mapping functions in voice conversion systems. In this study, we propose a novel …

Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes.

LH Chen, ZH Ling, LR Dai - INTERSPEECH, 2014 - isca-archive.org
This paper presents a deep neural network (DNN) based spectral envelope conversion
method. A global DNN is employed to model the complex non-linear mapping relationship …