A Comprehensive Review of Data-Driven Co-Speech Gesture Generation

S Nyatsanga, T Kucherenko, C Ahuja… - Computer Graphics …, 2023 - Wiley Online Library
Gestures that accompany speech are an essential part of natural and efficient embodied
human communication. The automatic generation of such co-speech gestures is a long …

Human motion generation: A survey

W Zhu, X Ma, D Ro, H Ci, J Zhang, J Shi… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Human motion generation aims to generate natural human pose sequences and shows
immense potential for real-world applications. Substantial progress has been made recently …

Generating holistic 3d human motion from speech

H Yi, H Liang, Y Liu, Q Cao, Y Wen… - Proceedings of the …, 2023 - openaccess.thecvf.com
This work addresses the problem of generating 3D holistic body motions from human
speech. Given a speech recording, we synthesize sequences of 3D body poses, hand …

BEAT: A large-scale semantic and emotional multi-modal dataset for conversational gestures synthesis

H Liu, Z Zhu, N Iwamoto, Y Peng, Z Li, Y Zhou… - European conference on …, 2022 - Springer
Achieving realistic, vivid, and human-like synthesized conversational gestures conditioned
on multi-modal data is still an unsolved problem due to the lack of available datasets …

Audio-driven co-speech gesture video generation

X Liu, Q Wu, H Zhou, Y Du, W Wu… - Advances in Neural …, 2022 - proceedings.neurips.cc
Co-speech gesture is crucial for human-machine interaction and digital entertainment. While
previous works mostly map speech audio to human skeletons (e.g., 2D keypoints), directly …

Analyzing input and output representations for speech-driven gesture generation

T Kucherenko, D Hasegawa, GE Henter… - Proceedings of the 19th …, 2019 - dl.acm.org
This paper presents a novel framework for automatic speech-driven gesture generation,
applicable to human-agent interaction including both virtual agents and robots. Specifically …

Learning speech-driven 3d conversational gestures from video

I Habibie, W Xu, D Mehta, L Liu, HP Seidel… - Proceedings of the 21st …, 2021 - dl.acm.org
We propose the first approach to synthesize the synchronous 3D conversational body and
hand gestures, as well as 3D face and head animations, of a virtual character from speech …

Evaluation of speech-to-gesture generation using bi-directional LSTM network

D Hasegawa, N Kaneko, S Shirakawa… - Proceedings of the 18th …, 2018 - dl.acm.org
We present a novel framework to automatically generate natural gesture motions
accompanying speech from audio utterances. Based on a Bi-Directional LSTM Network, our …

DisCo: Disentangled implicit content and rhythm learning for diverse co-speech gestures synthesis

H Liu, N Iwamoto, Z Zhu, Z Li, Y Zhou… - Proceedings of the 30th …, 2022 - dl.acm.org
Current co-speech gesture synthesis methods struggle to generate diverse motions and
typically collapse to a single or a few frequent motion sequences, which are trained on …

Modeling the conditional distribution of co-speech upper body gesture jointly using conditional-GAN and unrolled-GAN

B Wu, C Liu, CT Ishi, H Ishiguro - Electronics, 2021 - mdpi.com
Co-speech gestures are a crucial, non-verbal modality for humans to communicate. Social
agents also need this capability to be more human-like and comprehensive. This study aims …