Y Wei, D Hu, Y Tian, X Li - arXiv preprint arXiv:2208.09579, 2022 - arxiv.org
Sight and hearing are two senses that play a vital role in human communication and scene understanding. To mimic human perception ability, audio-visual learning, aimed at …
Diffusion models have experienced a surge of interest as highly expressive yet efficiently trainable probabilistic models. We show that these models are an excellent fit for …
T Ao, Q Gao, Y Lou, B Chen, L Liu - ACM Transactions on Graphics …, 2022 - dl.acm.org
Automatic synthesis of realistic co-speech gestures is an increasingly important yet challenging task in artificial embodied agent creation. Previous systems mainly focus on …
Generating speech-consistent body and gesture movements is a long-standing problem in virtual avatar creation. Previous studies often synthesize pose movement in a holistic …
Automatic synthesis of realistic gestures promises to transform the fields of animation, avatars and communicative agents. In off‐line applications, novel tools can alter the role of …
Data-driven modelling and synthesis of motion is an active research area with applications that include animation, games, and social robotics. This paper introduces a new class of …
Recent deep learning-based approaches have shown promising results for synthesizing plausible 3D human gestures from speech input. However, these approaches typically offer …
Co-speech gesture is crucial for human-machine interaction and digital entertainment. While previous works mostly map speech audio to human skeletons (eg, 2D keypoints), directly …
How can we teach robots or virtual assistants to gesture naturally? Can we go further and adapt the gesturing style to follow a specific speaker? Gestures that are naturally timed with …