Large language models for robotics: A survey

F Zeng, W Gan, Y Wang, N Liu, PS Yu - arXiv preprint arXiv:2311.07226, 2023 - arxiv.org
The human ability to learn, generalize, and control complex manipulation tasks through
multi-modality feedback suggests a unique capability, which we refer to as dexterity intelligence …

The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - arXiv preprint arXiv …, 2023 - arxiv.org
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …

Real-world robot applications of foundation models: A review

K Kawaharazuka, T Matsushima… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent developments in foundation models, like Large Language Models (LLMs) and
Vision-Language Models (VLMs), trained on extensive data, facilitate flexible application across …

Octo: An open-source generalist robot policy

Octo Model Team, D Ghosh, H Walke, K Pertsch… - arXiv preprint arXiv …, 2024 - arxiv.org
Large policies pretrained on diverse robot datasets have the potential to transform robotic
learning: instead of training new policies from scratch, such generalist robot policies may be …

On bringing robots home

NMM Shafiullah, A Rai, H Etukuru, Y Liu, I Misra… - arXiv preprint arXiv …, 2023 - arxiv.org
Throughout history, we have successfully integrated various machines into our homes.
Dishwashers, laundry machines, stand mixers, and robot vacuums are a few recent …

DROID: A large-scale in-the-wild robot manipulation dataset

A Khazatsky, K Pertsch, S Nair, A Balakrishna… - arXiv preprint arXiv …, 2024 - arxiv.org
The creation of large, diverse, high-quality robot manipulation datasets is an important
stepping stone on the path toward more capable and robust robotic manipulation policies …

CoPAL: corrective planning of robot actions with large language models

F Joublin, A Ceravola, P Smirnov, F Ocker… - arXiv preprint arXiv …, 2023 - arxiv.org
In the pursuit of fully autonomous robotic systems capable of taking over tasks traditionally
performed by humans, the complexity of open-world environments poses a considerable …

SUGAR: Pre-training 3D Visual Representations for Robotics

S Chen, R Garcia, I Laptev… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Learning generalizable visual representations from Internet data has yielded promising
results for robotics. Yet prevailing approaches focus on pre-training 2D representations …

Towards generalizable zero-shot manipulation via translating human interaction plans

H Bharadhwaj, A Gupta, V Kumar, S Tulsiani - arXiv preprint arXiv …, 2023 - arxiv.org
We pursue the goal of developing robots that can interact zero-shot with generic unseen
objects via a diverse repertoire of manipulation skills and show how passive human videos …

GRID: A platform for general robot intelligence development

S Vemprala, S Chen, A Shukla, D Narayanan… - arXiv preprint arXiv …, 2023 - arxiv.org
Developing machine intelligence abilities in robots and autonomous systems is an
expensive and time consuming process. Existing solutions are tailored to specific …