Human motion generation: A survey

W Zhu, X Ma, D Ro, H Ci, J Zhang, J Shi… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Human motion generation aims to generate natural human pose sequences and shows
immense potential for real-world applications. Substantial progress has been made recently …

Self-driving laboratories for chemistry and materials science

G Tom, SP Schmid, SG Baird, Y Cao, K Darvish… - Chemical …, 2024 - ACS Publications
Self-driving laboratories (SDLs) promise an accelerated application of the scientific method.
Through the automation of experimental workflows, along with autonomous experimental …

VoxPoser: Composable 3D value maps for robotic manipulation with language models

W Huang, C Wang, R Zhang, Y Li, J Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) are shown to possess a wealth of actionable knowledge that
can be extracted for robot manipulation in the form of reasoning and planning. Despite the …

TidyBot: Personalized robot assistance with large language models

J Wu, R Antonova, A Kan, M Lepert, A Zeng, S Song… - Autonomous …, 2023 - Springer
For a robot to personalize physical assistance effectively, it must learn user preferences that
can be generally reapplied to future scenarios. In this work, we investigate personalization of …

Scalable 3D captioning with pretrained models

T Luo, C Rockwell, H Lee… - Advances in Neural …, 2024 - proceedings.neurips.cc
We introduce Cap3D, an automatic approach for generating descriptive text for 3D objects.
This approach utilizes pretrained models from image captioning, image-text alignment, and …

Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

MimicPlay: Long-horizon imitation learning by watching human play

C Wang, L Fan, J Sun, R Zhang, L Fei-Fei, D Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Imitation learning from human demonstrations is a promising paradigm for teaching robots
manipulation skills in the real world. However, learning complex long-horizon tasks often …

Building cooperative embodied agents modularly with large language models

H Zhang, W Du, J Shan, Q Zhou, Y Du… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated impressive planning abilities in single-
agent embodied tasks across various domains. However, their capacity for planning and …

VIMA: Robot manipulation with multimodal prompts

Y Jiang, A Gupta, Z Zhang, G Wang, Y Dou, Y Chen… - 2023 - openreview.net
Prompt-based learning has emerged as a successful paradigm in natural language
processing, where a single general-purpose language model can be instructed to perform …

Octopus: Embodied vision-language programmer from environmental feedback

J Yang, Y Dong, S Liu, B Li, Z Wang, C Jiang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large vision-language models (VLMs) have achieved substantial progress in multimodal
perception and reasoning. Furthermore, when seamlessly integrated into an embodied …