We propose a speech-synthesis model for predicting appropriate voice styles on the basis of the character-annotated text for audiobook speech synthesis. An audiobook is more …
M Charfuelan, I Steiner - INTERSPEECH, 2013 - isca-archive.org
This paper describes a framework for synthesis of expressive speech based on MARY TTS and Emotion Markup Language (EmotionML). We describe the creation of expressive unit …
This paper describes the ALISA tool, which implements a lightly supervised method for sentence-level alignment of speech with imperfect transcripts. Its intended use is to enable …
In this paper, we construct a Japanese audiobook speech corpus called "J-MAC" for speech synthesis research. With the success of reading-style speech synthesis, the research target …
Creating new voices for a TTS system often requires a costly procedure of designing and recording an audio corpus, a time-consuming and effort-intensive task. Using publicly …
EB Lange, D Thiele, MM Kuijpers - Psychology of Aesthetics …, 2022 - psycnet.apa.org
Narrative aesthetic absorption describes a state in which we focus on the story world of a narrative while becoming less aware of our surroundings and ourselves. It is characterized …
During the last decades, the majority of works devoted to expressive speech acoustic analysis have focused on emotions, although there is a growing interest in other speaking …
Nowadays, especially with the upswing of neural networks, speech synthesis is almost totally data driven. The goal of this thesis is to provide methods for automatic and …
This paper introduces a method for lightly supervised discriminative training using MMI to improve the alignment of speech and text data for use in training HMM-based TTS systems …