Related articles - Academic Resource Search

Tacotron: Towards end-to-end speech synthesis

Y Wang, RJ Skerry-Ryan, D Stanton, Y Wu… - arXiv preprint arXiv …, 2017 - arxiv.org

A text-to-speech synthesis system typically consists of multiple stages, such as a text
analysis frontend, an acoustic model and an audio synthesis module. Building these …

Cited by 2183 · Related articles · All 10 versions

[PDF] abracadoudou.com

[PDF] Tacotron: A fully end-to-end text-to-speech synthesis model

Y Wang, RJ Skerry-Ryan… - arXiv preprint …, 2017 - bengio.abracadoudou.com

ABSTRACT A text-to-speech synthesis system typically consists of multiple stages, such as a
text analysis frontend, an acoustic model and an audio synthesis module. Building these …

Cited by 287 · Related articles · All 3 versions

[PDF] arxiv.org

NaturalSpeech: End-to-end text-to-speech synthesis with human-level quality

X Tan, J Chen, H Liu, J Cong, C Zhang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Text-to-speech (TTS) has made rapid progress in both academia and industry in recent
years. Some questions naturally arise that whether a TTS system can achieve human-level …

Cited by 171 · Related articles · All 9 versions

[PDF] arxiv.org

Flowtron: an autoregressive flow-based generative network for text-to-speech synthesis

R Valle, K Shih, R Prenger, B Catanzaro - arXiv preprint arXiv:2005.05957, 2020 - arxiv.org

In this paper we propose Flowtron: an autoregressive flow-based generative network for text-
to-speech synthesis with control over speech variation and style transfer. Flowtron borrows …

Cited by 172 · Related articles · All 3 versions

[PDF] arxiv.org

Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language

Y Yasuda, X Wang, S Takaki… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

End-to-end speech synthesis is a promising approach that directly converts raw text to
speech. Although it was shown that Tacotron2 outperforms classical pipeline systems with …

Cited by 110 · Related articles · All 6 versions

[PDF] arxiv.org

Semi-supervised training for improving data efficiency in end-to-end speech synthesis

YA Chung, Y Wang, WN Hsu, Y Zhang… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

Although end-to-end text-to-speech (TTS) models such as Tacotron have shown excellent
results, they typically require a sizable set of high-quality <text, audio> pairs for training …

Cited by 141 · Related articles · All 10 versions

[PDF] arxiv.org

StyleTTS: A style-based generative model for natural and diverse text-to-speech synthesis

YA Li, C Han, N Mesgarani - arXiv preprint arXiv:2205.15439, 2022 - arxiv.org

Text-to-Speech (TTS) has recently seen great progress in synthesizing high-quality speech
owing to the rapid development of parallel TTS systems, but producing speech with …

Cited by 36 · Related articles · All 2 versions

[PDF] arxiv.org

Semi-supervised generative modeling for controllable speech synthesis

R Habib, S Mariooryad, M Shannon… - arXiv preprint arXiv …, 2019 - arxiv.org

We present a novel generative model that combines state-of-the-art neural text-to-speech
(TTS) with semi-supervised probabilistic latent variable models. By providing partial …

Cited by 59 · Related articles · All 4 versions

[PDF] arxiv.org

End-to-end adversarial text-to-speech

J Donahue, S Dieleman, M Bińkowski, E Elsen… - arXiv preprint arXiv …, 2020 - arxiv.org

Modern text-to-speech synthesis pipelines typically involve multiple processing stages, each
of which is designed or learnt independently from the rest. In this work, we take on the …

Cited by 217 · Related articles · All 3 versions

[PDF] arxiv.org

DelightfulTTS 2: End-to-end speech synthesis with adversarial vector-quantized auto-encoders

Y Liu, R Xue, L He, X Tan, S Zhao - arXiv preprint arXiv:2207.04646, 2022 - arxiv.org

Current text to speech (TTS) systems usually leverage a cascaded acoustic model and
vocoder pipeline with mel-spectrograms as the intermediate representations, which suffer …

Cited by 28 · Related articles · All 5 versions
