Bidirectional LSTM networks employing stacked bottleneck features for expressive speech-driven head motion synthesis

K Haag, H Shimodaira - … Virtual Agents: 16th International Conference, IVA …, 2016 - Springer
Previous work in speech-driven head motion synthesis is centred around Hidden Markov
Model (HMM) based methods and data that does not show a large variability of …

[PDF] Articulatory movement prediction using deep bidirectional long short-term memory based recurrent neural networks and word/phone embeddings.

P Zhu, L Xie, Y Chen - Interspeech, 2015 - isca-archive.org
Automatic prediction of articulatory movements from speech or text can be beneficial for
many applications such as speech recognition and synthesis. A recent approach has …

[PDF] BLSTM neural networks for speech driven head motion synthesis.

C Ding, P Zhu, L Xie - Interspeech, 2015 - isca-archive.org
Head motion naturally occurs in synchrony with speech and carries important intention,
attitude and emotion factors. This paper aims to synthesize head motions from natural …

[HTML] Speech-driven head motion generation from waveforms

JH Lu, H Shimodaira - Speech Communication, 2024 - Elsevier
The head motion generation task for speech-driven virtual agent animation is commonly
explored with handcrafted audio features, such as MFCCs, as input features, plus additional …

[PDF][PDF] Speech-driven head motion synthesis using neural networks

C Ding, P Zhu, L Xie, D Jiang, ZH Fu - Fifteenth Annual Conference of the …, 2014 - Citeseer
This paper presents a neural network approach for speech-driven head motion synthesis,
which can automatically predict a speaker's head movement from his/her speech …

Prediction of head motion from speech waveforms with a canonical-correlation-constrained autoencoder

JH Lu, H Shimodaira - arXiv preprint arXiv:2002.01869, 2020 - arxiv.org
This study investigates the direct use of speech waveforms to predict head motion for
speech-driven head-motion synthesis, whereas the use of spectral features such as MFCC …

Perceptual enhancement of emotional mocap head motion: An experimental study

Y Ding, L Shi, Z Deng - 2017 Seventh International Conference …, 2017 - ieeexplore.ieee.org
Motion capture (mocap) systems have been widely used to collect various human behavior
data. Despite numerous existing research efforts on mocap motion processing and …

Mapping ultrasound-based articulatory images and vowel sounds with a deep neural network framework

J Wei, Q Fang, X Zheng, W Lu, Y He, J Dang - Multimedia Tools and …, 2016 - Springer
Constructing a mapping between articulatory movements and corresponding speech could
significantly facilitate speech training and the development of speech aids for voice disorder …

Predicting articulatory movement from text using deep architecture with stacked bottleneck features

Z Wei, Z Wu, L Xie - 2016 Asia-Pacific Signal and Information …, 2016 - ieeexplore.ieee.org
Using speech or text to predict articulatory movements can have potential benefits for
speech related applications. Many approaches have been proposed to solve the acoustic-to …

[PDF] Articulatory copy synthesis based on the speech synthesizer vocaltractlab

Y Gao - 2022 - core.ac.uk
Speech is the most common mode of human communication. As shown in Figure 1.1, the
process of communication involves three main events: the production of speech, the …