Reconstructing 3D hand-face interactions with deformations from a single image is a challenging yet crucial task with broad applications in AR, VR, and gaming. The challenges …
S Xu, Z Wang, YX Wang, LY Gui - arXiv preprint arXiv:2403.19652, 2024 - arxiv.org
Text-conditioned human motion generation has experienced significant advancements with diffusion models trained on extensive motion capture data and corresponding textual …
D Daiya, D Conover, A Bera - arXiv preprint arXiv:2409.20502, 2024 - arxiv.org
We propose a novel framework COLLAGE for generating collaborative agent-object-agent interactions by leveraging large language models (LLMs) and hierarchical motion-specific …
J Liu, W Dai, C Wang, Y Cheng, Y Tang, X Tong - ecva.net
Conventional text-to-motion generation methods are usually trained on limited text-motion pairs, making them hard to generalize to open-vocabulary scenarios. Some works use the …