Review of end-to-end speech synthesis technology based on deep learning

Z Mu, X Yang, Y Dong - arXiv preprint arXiv:2104.09995, 2021 - arxiv.org
As an indispensable part of modern human-computer interaction system, speech synthesis
technology helps users get the output of intelligent machine more easily and intuitively, thus …

A review of deep learning based speech synthesis

Y Ning, S He, Z Wu, C Xing, LJ Zhang - Applied Sciences, 2019 - mdpi.com
Speech synthesis, also known as text-to-speech (TTS), has attracted increasingly more
attention. Recent advances on speech synthesis are overwhelmingly contributed by deep …

Text to speech synthesis: a systematic review, deep learning based architecture and future research direction

F Khanam, FA Munmun, NA Ritu, AK Saha… - Journal of Advances in …, 2022 - academia.edu
Text to Speech (TTS) synthesis is a process of translating natural language text into speech.
Pieces of recorded speech generate synthesized speech and a database is maintained for …

Improving Mandarin end-to-end speech synthesis by self-attention and learnable Gaussian bias

F Yang, S Yang, P Zhu, P Yan… - 2019 IEEE automatic …, 2019 - ieeexplore.ieee.org
Compared to conventional speech synthesis, end-to-end speech synthesis has achieved
much better naturalness with more simplified system building pipeline. End-to-end …

Deep learning based multilingual speech synthesis using multi feature fusion methods

P Nuthakki, M Katamaneni, CS JN, K Gubbala… - ACM Transactions on …, 2023 - dl.acm.org
The poor intelligibility and out-of-the-ordinary nature of the traditional concatenation speech
synthesis technologies are two major problems. CNN's context deep learning approaches …

NaturalSpeech: End-to-end text-to-speech synthesis with human-level quality

X Tan, J Chen, H Liu, J Cong, C Zhang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Text-to-speech (TTS) has made rapid progress in both academia and industry in recent
years. Some questions naturally arise that whether a TTS system can achieve human-level …

Mongolian text-to-speech system based on deep neural network

R Liu, F Bao, G Gao, Y Wang - … 2017, Lianyungang, China, October 11–13 …, 2018 - Springer
Recently, Deep Neural Network (DNN), which is a feed-forward artificial neural
network with many hidden layers, has opened a new research direction for Speech …

JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis

R Sonobe, S Takamichi, H Saruwatari - arXiv preprint arXiv:1711.00354, 2017 - arxiv.org
Thanks to improvements in machine learning techniques including deep learning, a free
large-scale speech corpus that can be shared between academic institutions and …

Conventional and contemporary approaches used in text to speech synthesis: A review

N Kaur, P Singh - Artificial Intelligence Review, 2023 - Springer
Nowadays speech synthesis or text to speech (TTS), an ability of system to produce human
like natural sounding voice from the written text, is gaining popularity in the field of speech …

Neural speech synthesis with transformer network

N Li, S Liu, Y Liu, S Zhao, M Liu - … of the AAAI conference on artificial …, 2019 - ojs.aaai.org
Although end-to-end neural text-to-speech (TTS) methods (such as Tacotron2) are proposed
and achieve state-of-the-art performance, they still suffer from two problems: 1) low efficiency …