Related articles - Academic resource search

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org

Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Cited by 392 · Related articles · All 2 versions

[PDF] aaai.org

RobuTrans: A robust transformer-based text-to-speech model

N Li, Y Liu, Y Wu, S Liu, S Zhao, M Liu - Proceedings of the AAAI …, 2020 - ojs.aaai.org

Recently, neural network based speech synthesis has achieved outstanding results, by
which the synthesized audios are of excellent quality and naturalness. However, current …

Cited by 46 · Related articles · All 7 versions

[PDF] arxiv.org

Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis

RJ Weiss, RJ Skerry-Ryan, E Battenberg… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

We describe a sequence-to-sequence neural network which directly generates speech
waveforms from text inputs. The architecture extends the Tacotron model by incorporating a …

Cited by 119 · Related articles · All 9 versions

[PDF] arxiv.org

Semi-supervised generative modeling for controllable speech synthesis

R Habib, S Mariooryad, M Shannon… - arXiv preprint arXiv …, 2019 - arxiv.org

We present a novel generative model that combines state-of-the-art neural text-to-speech
(TTS) with semi-supervised probabilistic latent variable models. By providing partial …

Cited by 59 · Related articles · All 4 versions

[PDF] arxiv.org

Diff-TTS: A denoising diffusion model for text-to-speech

M Jeong, H Kim, SJ Cheon, BJ Choi, NS Kim - arXiv preprint arXiv …, 2021 - arxiv.org

Although neural text-to-speech (TTS) models have attracted a lot of attention and succeeded
in generating human-like speech, there is still room for improvements to its naturalness and …

Cited by 180 · Related articles · All 4 versions

[PDF] arxiv.org

Flowtron: an autoregressive flow-based generative network for text-to-speech synthesis

R Valle, K Shih, R Prenger, B Catanzaro - arXiv preprint arXiv:2005.05957, 2020 - arxiv.org

In this paper we propose Flowtron: an autoregressive flow-based generative network for
text-to-speech synthesis with control over speech variation and style transfer. Flowtron borrows …

Cited by 170 · Related articles · All 3 versions

[PDF] arxiv.org

Controllable neural text-to-speech synthesis using intuitive prosodic features

T Raitio, R Rasipuram, D Castellani - arXiv preprint arXiv:2009.06775, 2020 - arxiv.org

Modern neural text-to-speech (TTS) synthesis can generate speech that is indistinguishable
from natural speech. However, the prosody of generated utterances often represents the …

Cited by 86 · Related articles · All 7 versions

[PDF] arxiv.org

DelightfulTTS: The Microsoft speech synthesis system for Blizzard Challenge 2021

Y Liu, Z Xu, G Wang, K Chen, B Li, X Tan, J Li… - arXiv preprint arXiv …, 2021 - arxiv.org

This paper describes the Microsoft end-to-end neural text to speech (TTS) system:
DelightfulTTS for Blizzard Challenge 2021. The goal of this challenge is to synthesize …

Cited by 61 · Related articles · All 4 versions

[HTML] mdpi.com

A review of deep learning based speech synthesis

Y Ning, S He, Z Wu, C Xing, LJ Zhang - Applied Sciences, 2019 - mdpi.com

Speech synthesis, also known as text-to-speech (TTS), has attracted increasingly more
attention. Recent advances on speech synthesis are overwhelmingly contributed by deep …

Cited by 193 · Related articles · All 6 versions

[PDF] arxiv.org

Deep Voice 3: Scaling text-to-speech with convolutional sequence learning

W Ping, K Peng, A Gibiansky, SO Arik… - arXiv preprint arXiv …, 2017 - arxiv.org

We present Deep Voice 3, a fully-convolutional attention-based neural text-to-speech (TTS)
system. Deep Voice 3 matches state-of-the-art neural speech synthesis systems in …

Cited by 543 · Related articles · All 5 versions
