A Comprehensive Review of Data‐Driven Co‐Speech Gesture Generation

S Nyatsanga, T Kucherenko, C Ahuja… - Computer Graphics …, 2023 - Wiley Online Library
Gestures that accompany speech are an essential part of natural and efficient embodied
human communication. The automatic generation of such co‐speech gestures is a long …

Learning in audio-visual context: A review, analysis, and new perspective

Y Wei, D Hu, Y Tian, X Li - arXiv preprint arXiv:2208.09579, 2022 - arxiv.org
Sight and hearing are two senses that play a vital role in human communication and scene
understanding. To mimic human perception ability, audio-visual learning, aimed at …

Listen, denoise, action! Audio-driven motion synthesis with diffusion models

S Alexanderson, R Nagy, J Beskow… - ACM Transactions on …, 2023 - dl.acm.org
Diffusion models have experienced a surge of interest as highly expressive yet efficiently
trainable probabilistic models. We show that these models are an excellent fit for …

Rhythmic gesticulator: Rhythm-aware co-speech gesture synthesis with hierarchical neural embeddings

T Ao, Q Gao, Y Lou, B Chen, L Liu - ACM Transactions on Graphics …, 2022 - dl.acm.org
Automatic synthesis of realistic co-speech gestures is an increasingly important yet
challenging task in artificial embodied agent creation. Previous systems mainly focus on …

Learning hierarchical cross-modal association for co-speech gesture generation

X Liu, Q Wu, H Zhou, Y Xu, R Qian… - Proceedings of the …, 2022 - openaccess.thecvf.com
Generating speech-consistent body and gesture movements is a long-standing problem in
virtual avatar creation. Previous studies often synthesize pose movement in a holistic …

Style‐controllable speech‐driven gesture synthesis using normalising flows

S Alexanderson, GE Henter… - Computer Graphics …, 2020 - Wiley Online Library
Automatic synthesis of realistic gestures promises to transform the fields of animation,
avatars and communicative agents. In off‐line applications, novel tools can alter the role of …

MoGlow: Probabilistic and controllable motion synthesis using normalising flows

GE Henter, S Alexanderson, J Beskow - ACM Transactions on Graphics …, 2020 - dl.acm.org
Data-driven modelling and synthesis of motion is an active research area with applications
that include animation, games, and social robotics. This paper introduces a new class of …

A motion matching-based framework for controllable gesture synthesis from speech

I Habibie, M Elgharib, K Sarkar, A Abdullah… - ACM SIGGRAPH 2022 …, 2022 - dl.acm.org
Recent deep learning-based approaches have shown promising results for synthesizing
plausible 3D human gestures from speech input. However, these approaches typically offer …

Audio-driven co-speech gesture video generation

X Liu, Q Wu, H Zhou, Y Du, W Wu… - Advances in Neural …, 2022 - proceedings.neurips.cc
Co-speech gesture is crucial for human-machine interaction and digital entertainment. While
previous works mostly map speech audio to human skeletons (e.g., 2D keypoints), directly …

Style transfer for co-speech gesture animation: A multi-speaker conditional-mixture approach

C Ahuja, DW Lee, YI Nakano, LP Morency - Computer Vision–ECCV 2020 …, 2020 - Springer
How can we teach robots or virtual assistants to gesture naturally? Can we go further and
adapt the gesturing style to follow a specific speaker? Gestures that are naturally timed with …