A survey on integration of large language models with intelligent robots

Y Kim, D Kim, J Choi, J Park, N Oh, D Park - Intelligent Service Robotics, 2024 - Springer
In recent years, the integration of large language models (LLMs) has revolutionized the field
of robotics, enabling robots to communicate, understand, and reason with human-like …

Harmon: Whole-body motion generation of humanoid robots from language descriptions

Z Jiang, Y Xie, J Li, Y Yuan, Y Zhu, Y Zhu - arXiv preprint arXiv …, 2024 - arxiv.org
Humanoid robots, with their human-like embodiment, have the potential to integrate
seamlessly into human environments. Critical to their coexistence and cooperation with …

Llf-bench: Benchmark for interactive learning from language feedback

CA Cheng, A Kolobov, D Misra, A Nie… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce a new benchmark, LLF-Bench (Learning from Language Feedback
Benchmark; pronounced as "elf-bench"), to evaluate the ability of AI agents to interactively …

Autoregressive action sequence learning for robotic manipulation

X Zhang, Y Liu, H Chang, L Schramm… - arXiv preprint arXiv …, 2024 - arxiv.org
Autoregressive models have demonstrated remarkable success in natural language
processing. In this work, we design a simple yet effective autoregressive architecture for …

VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs of Thought

GH Sarch, L Jang, MJ Tarr, WW Cohen… - The Thirty-eighth …, 2024 - openreview.net
Large-scale generative language and vision-language models (LLMs and VLMs) excel in
few-shot in-context learning for decision making and instruction following. However, they …

A survey of embodied intelligence systems based on large models

W Wang, N Tan, K Huang, Y Zhang, W Zheng, F Sun - Acta Automatica Sinica (自动化学报), 2025 - aas.net.cn
Benefiting from the recent rapid development of large-scale pre-trained models with world knowledge,
embodied intelligence based on large models has achieved strong results across a variety of tasks, demonstrating powerful generalization ability and broad application prospects in many fields. In view of this …

VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs

G Sarch, L Jang, MJ Tarr, WW Cohen, K Marino… - arXiv preprint arXiv …, 2024 - arxiv.org
Large-scale generative language and vision-language models excel in in-context learning
for decision making. However, they require high-quality exemplar demonstrations to be …

InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning

M Han, Y Zhu, SC Zhu, YN Wu, Y Zhu - arXiv preprint arXiv:2405.19758, 2024 - arxiv.org
Learning abstract state representations and knowledge is crucial for long-horizon robot
planning. We present InterPreT, an LLM-powered framework for robots to learn symbolic …

CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision

GC Kang, J Kim, K Shim, JK Lee, BT Zhang - arXiv preprint arXiv …, 2024 - arxiv.org
This paper explores how non-experts can teach robots desired skills in their environments.
We argue that natural language is an intuitive and accessible interface for robot learning. To …

Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework

Y Metz, D Lindner, R Baur, M El-Assady - arXiv preprint arXiv:2411.11761, 2024 - arxiv.org
Reinforcement Learning from Human Feedback (RLHF) has become a powerful tool to fine-
tune or train agentic machine learning models. Similar to how humans interact in social …