One model, many languages: Meta-learning for multilingual text-to-speech

V Pratap, A Tjandra, B Shi, P Tomasello, A Babu… - Journal of Machine …, 2024 - jmlr.org

Expanding the language coverage of speech technology has the potential to improve
access to information for many more people. However, current speech technology is …

Cited by 195 · Related articles · All 3 versions

[PDF] mlr.press

YourTTS: Towards zero-shot multi-speaker TTS and zero-shot voice conversion for everyone

E Casanova, J Weber, CD Shulby… - International …, 2022 - proceedings.mlr.press

YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker
TTS. Our method builds upon the VITS model and adds several novel modifications for zero …

Cited by 321 · Related articles · All 7 versions

[PDF] arxiv.org

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org

Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Cited by 397 · Related articles · All 2 versions

[PDF] sciencedirect.com

Emotional voice conversion: Theory, databases and ESD

K Zhou, B Sisman, R Liu, H Li - Speech Communication, 2022 - Elsevier

In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …

Cited by 147 · Related articles · All 7 versions

[PDF] arxiv.org

The decades progress on code-switching research in NLP: A systematic survey on trends and challenges

GI Winata, AF Aji, ZX Yong, T Solorio - arXiv preprint arXiv:2212.09660, 2022 - arxiv.org

Code-Switching, a common phenomenon in written text and conversation, has been studied
over decades by the natural language processing (NLP) research community. Initially, code …

Cited by 26 · Related articles · All 3 versions

[PDF] arxiv.org

SANE-TTS: stable and natural end-to-end multilingual text-to-speech

H Cho, W Jung, J Lee, SH Woo - arXiv preprint arXiv:2206.12132, 2022 - arxiv.org

In this paper, we present SANE-TTS, a stable and natural end-to-end multilingual TTS
model. By the difficulty of obtaining multilingual corpus for given speaker, training …

Cited by 29 · Related articles · All 6 versions

[PDF] rug.nl

A systematic review and analysis of multilingual data strategies in text-to-speech for low-resource languages

P Do, M Coler, J Dijkstra, E Klabbers - Interspeech 2021, 2021 - research.rug.nl

We provide a systematic review of past studies that use multilingual data for text-to-speech
(TTS) of low-resource languages (LRLs). We focus on the strategies used by these studies …

Cited by 13 · Related articles · All 5 versions

[PDF] arxiv.org

Many-to-many spoken language translation via unified speech and text representation learning with unit-to-unit translation

M Kim, J Choi, D Kim, YM Ro - arXiv preprint arXiv:2308.01831, 2023 - arxiv.org

In this paper, we propose a method to learn unified representations of multilingual speech
and text with a single model, especially focusing on the purpose of speech synthesis. We …

Cited by 13 · Related articles · All 2 versions

[PDF] ieee.org

ZMM-TTS: Zero-shot multilingual and multispeaker speech synthesis conditioned on self-supervised discrete speech representations

C Gong, X Wang, E Cooper, D Wells… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org

Neural text-to-speech (TTS) has achieved humanlike synthetic speech for single-speaker,
single-language synthesis. Multilingual TTS systems are limited to resource-rich languages …

Cited by 8 · Related articles · All 2 versions

[PDF] arxiv.org

Deep speech synthesis from articulatory representations

P Wu, S Watanabe, L Goldstein, AW Black… - arXiv preprint arXiv …, 2022 - arxiv.org

In the articulatory synthesis task, speech is synthesized from input features containing
information about the physical behavior of the human vocal tract. This task provides a …

Cited by 23 · Related articles · All 8 versions
