Unsupervised learning for sequence-to-sequence text-to-speech for low-resource languages

H Zhang, Y Lin - arXiv preprint arXiv:2008.04549, 2020 - arxiv.org
Recently, sequence-to-sequence models with attention have been successfully applied in
Text-to-speech (TTS). These models can generate near-human speech with a large …

A comparison of text selection algorithms for sequence-to-sequence neural TTS

S Taubert, J Sternkopf, S Kahl… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Previous research demonstrated that text selection algorithms applied in the context of
concatenative and parametric text-to-speech (TTS) systems were able to increase synthesis …

Lightweight, Multi-speaker, Multi-lingual Indic Text-To-Speech

A Singh, A Nagireddi, A Jayakumar… - IEEE Open Journal …, 2024 - ieeexplore.ieee.org
The Lightweight, Multi-speaker, Multi-lingual Indic Text-to-Speech (LIMMITS'23) challenge is
organized as part of the ICASSP 2023 Signal Processing Grand Challenge. LIMMITS'23 …

Towards a vowel formant based quality metric for Text-to-Speech systems: Measuring monophthong naturalness

S Albrecht, R Tamboli, S Taubert, M Eibl… - 2022 IEEE 9th …, 2022 - ieeexplore.ieee.org
This contribution proposes an objective, vowel formant based quality metric for assessing
the naturalness of monophthongs synthesized by Text-to-Speech (TTS) systems. This could …

Curriculum Learning Based Approach for Faster Convergence of TTS Model

N Kaur, PK Ghosh - International Conference on Speech and Computer, 2023 - Springer
With the advent of deep learning, Text-to-Speech technology has been revolutionized, and
current state-of-the-art models are capable of synthesizing almost human-like speech …

Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data

H Zhang, Y Lin - arXiv preprint arXiv:2110.07210, 2021 - arxiv.org
Recently, sequence-to-sequence (seq-to-seq) models have been successfully applied in
text-to-speech (TTS) to synthesize speech for single-language text. To synthesize speech for …

Designing a large recording script for open-domain English speech synthesis

S Kim, H Kim, Y Lee, B Kim, Y Won, B Kim - Phonetics and Speech …, 2021 - eksss.org
This paper proposes a method for designing a large recording script for open-domain
English speech synthesis. For read-aloud style text, 12 domains and 294 sub-domains were …