TUNDRA: a multilingual corpus of found data for TTS research created with light supervision

A Stan, O Watts, Y Mamiya, M Giurgiu… - … 2013, 14th Annual …, 2013 - research.ed.ac.uk
Abstract Simple4All Tundra (version 1.0) is the first release of a standardised multilingual
corpus designed for text-to-speech research with imperfect or found data. The corpus …

ALISA: An automatic lightly supervised speech segmentation and alignment tool

A Stan, Y Mamiya, J Yamagishi, P Bell, O Watts… - Computer Speech & …, 2016 - Elsevier
This paper describes the ALISA tool, which implements a lightly supervised method for
sentence-level alignment of speech with imperfect transcripts. Its intended use is to enable …

Unsupervised and lightly-supervised learning for rapid construction of TTS systems in multiple languages from 'found' data: evaluation and analysis

O Watts, A Stan, R Clark, Y Mamiya… - 8th ISCA Speech …, 2013 - research.ed.ac.uk
This paper presents techniques for building text-to-speech frontends in a way that avoids the
need for language-specific expert knowledge, but instead relies on universal resources …

A preliminary Plains Cree speech synthesizer

A Harrigan, T Mills, A Arppe - Proceedings of the Workshop …, 2019 - journals.colorado.edu
This paper discusses the development and evaluation of a Speech Synthesizer for Plains
Cree, an Algonquian language of North America. Synthesis is achieved using Simple4All …

[PDF] Investigating the Robustness of Sequence-to-Sequence Text-to-Speech Models to Imperfectly-Transcribed Training Data.

J Fong, PO Gallegos, Z Hodari, S King - INTERSPEECH, 2019 - isca-archive.org
Abstract Sequence-to-sequence (S2S) text-to-speech (TTS) models can synthesise high
quality speech when large amounts of annotated training data are available. Transcription …

Controlling text-to-speech pronunciation using limited linguistic resources

J Fong - 2024 - era.ed.ac.uk
Correct pronunciation is essential for high-quality text-to-speech (TTS) systems. To achieve
this, the majority of TTS systems rely on phonemes as an intermediate representation …

The Simple4All entry to the Blizzard Challenge 2013

O Watts, A Stan, Y Mamiya, A Suni… - Proc. Blizzard …, 2013 - research.ed.ac.uk
We describe the synthetic voices entered into the 2013 Blizzard Challenge by the
SIMPLE4ALL consortium. The 2013 Blizzard Challenge presents an opportunity to test and …

Building synthetic voices for under-resourced languages: The feasibility of using audiobook data

F de Wet, W Van der Walt, N Dlamini… - … Association of South …, 2017 - ieeexplore.ieee.org
Creating synthetic voices that are natural and intelligible is a daunting challenge for well-
resourced languages. The challenge is much bigger for languages in which the speech and …

Combining lightly-supervised learning and user feedback to construct and improve a statistical parametric speech synthesizer for Malay

LC Yong, O Watts, S King - Research Journal of Applied …, 2015 - research.ed.ac.uk
In this study, we aim to reduce the human effort in preparing training data for synthesizing
human speech and improve the quality of synthetic speech. In spite of the learning-from-data …

[BOOK] Query-by-Example Spoken Term Detection for Low-resource Languages

H Wang - 2014 - search.proquest.com
In this thesis, we consider the problem of query-by-example (QbyE) spoken term detection
(STD) for low-resource languages. The problem is to automatically detect and locate the …