An overview of voice conversion and its challenges: From statistical modeling to deep learning

B Sisman, J Yamagishi, S King… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …

[BOOK][B] Prosodic patterns in English conversation

NG Ward - 2019 - books.google.com
Language is more than words: it includes the prosodic features and patterns that we use,
subconsciously, to frame meanings and achieve our goals in our interaction with others …

[BOOK][B] Virtual humans: Today and tomorrow

D Burden, M Savin-Baden - 2019 - taylorfrancis.com
Virtual Humans provides a much-needed definition of what constitutes a 'virtual human' and
places virtual humans within the wider context of Artificial Intelligence development. It …

A survey on speech synthesis techniques in Indian languages

SP Panda, AK Nayak, SC Rai - Multimedia Systems, 2020 - Springer
The text to speech technology has achieved significant progress during the past decade and
is an active area of research and development in providing different human–computer …

Articulatory control of HMM-based parametric speech synthesis using feature-space-switched multiple regression

ZH Ling, K Richmond… - IEEE Transactions on …, 2012 - ieeexplore.ieee.org
In previous work we proposed a method to control the characteristics of synthetic speech
flexibly by integrating articulatory features into a hidden Markov model (HMM) based …

Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis

PK Muthukumar, AW Black - 2014 IEEE International …, 2014 - ieeexplore.ieee.org
Speech synthesis systems are typically built with speech data and transcriptions. In this
paper, we try to build synthesis systems when no transcriptions or knowledge about the …

Use of articulatory bottle-neck features for query-by-example spoken term detection in low resource scenarios

G Mantena, K Prahallad - 2014 IEEE international conference …, 2014 - ieeexplore.ieee.org
For query-by-example spoken term detection (QbE-STD), generation of phone
posteriorgrams requires labelled data which would be difficult for languages with low …

Speech synthesis from found data

P Baljekar - 2018 - kilthub.cmu.edu
Text-to-speech synthesis (TTS) has progressed to such a stage that given a large, clean,
phonetically balanced dataset from a single speaker, it can produce intelligible, almost …

[PDF] Using articulatory features and inferred phonological segments in zero resource speech processing.

P Baljekar, S Sitaram, PK Muthukumar, AW Black - INTERSPEECH, 2015 - isca-archive.org
Unsupervised discovery of subword units is an important problem in recognition and
synthesis of zero-resource languages, in which phonesets may not be known and the only …

Articulatory features for speech-driven head motion synthesis

AB Youssef, H Shimodaira, DA Braude - Proc. Interspeech, 2013 - research.ed.ac.uk
This study investigates the use of articulatory features for speech-driven head motion
synthesis as opposed to prosody features such as F0 and energy which have been mainly …