Human motion generation: A survey

W Zhu, X Ma, D Ro, H Ci, J Zhang, J Shi… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Human motion generation aims to generate natural human pose sequences and shows
immense potential for real-world applications. Substantial progress has been made recently …

HumanGaussian: Text-driven 3D human generation with Gaussian splatting

X Liu, X Zhan, J Tang, Y Shan, G Zeng… - Proceedings of the …, 2024 - openaccess.thecvf.com
Realistic 3D human generation from text prompts is a desirable yet challenging task.
Existing methods optimize 3D representations like mesh or neural fields via score distillation …

Taming diffusion models for audio-driven co-speech gesture generation

L Zhu, X Liu, X Liu, R Qian, Z Liu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Animating virtual avatars to make co-speech gestures facilitates various applications in
human-machine interaction. The existing methods mainly rely on generative adversarial …

GestureDiffuCLIP: Gesture diffusion model with CLIP latents

T Ao, Z Zhang, L Liu - ACM Transactions on Graphics (TOG), 2023 - dl.acm.org
The automatic generation of stylized co-speech gestures has recently received increasing
attention. Previous systems typically allow style control via predefined text labels or example …

Generating holistic 3D human motion from speech

H Yi, H Liang, Y Liu, Q Cao, Y Wen… - Proceedings of the …, 2023 - openaccess.thecvf.com
This work addresses the problem of generating 3D holistic body motions from human
speech. Given a speech recording, we synthesize sequences of 3D body poses, hand …

Expressive talking head generation with granular audio-visual control

B Liang, Y Pan, Z Guo, H Zhou… - Proceedings of the …, 2022 - openaccess.thecvf.com
Generating expressive talking heads is essential for creating virtual humans. However,
existing one- or few-shot methods focus on lip-sync and head motion, ignoring the emotional …

Rhythmic gesticulator: Rhythm-aware co-speech gesture synthesis with hierarchical neural embeddings

T Ao, Q Gao, Y Lou, B Chen, L Liu - ACM Transactions on Graphics …, 2022 - dl.acm.org
Automatic synthesis of realistic co-speech gestures is an increasingly important yet
challenging task in artificial embodied agent creation. Previous systems mainly focus on …

Semantic-aware implicit neural audio-driven video portrait generation

X Liu, Y Xu, Q Wu, H Zhou, W Wu, B Zhou - European conference on …, 2022 - Springer
Animating high-fidelity video portrait with speech audio is crucial for virtual reality and digital
entertainment. While most previous studies rely on accurate explicit structural information …

LivelySpeaker: Towards semantic-aware co-speech gesture generation

Y Zhi, X Cun, X Chen, X Shen, W Guo… - Proceedings of the …, 2023 - openaccess.thecvf.com
Gestures are non-verbal but important behaviors accompanying people's speech. While
previous methods are able to generate speech rhythm-synchronized gestures, the semantic …

Large motion model for unified multi-modal motion generation

M Zhang, D Jin, C Gu, F Hong, Z Cai, J Huang… - … on Computer Vision, 2025 - Springer
Human motion generation, a cornerstone technique in animation and video production, has
widespread applications in various tasks like text-to-motion and music-to-dance. Previous …