Robot learning in the era of foundation models: A survey

X Xiao, J Liu, Z Wang, Y Zhou, Y Qi, Q Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
The proliferation of Large Language Models (LLMs) has fueled a shift in robot learning
from automation towards general embodied Artificial Intelligence (AI). Adopting foundation …

Language-conditioned learning for robotic manipulation: A survey

H Zhou, X Yao, Y Meng, S Sun, Z Bing, K Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Language-conditioned robotic manipulation represents a cutting-edge area of research,
enabling seamless communication and cooperation between humans and robotic agents …

RL-VLM-F: Reinforcement learning from vision language foundation model feedback

Y Wang, Z Sun, J Zhang, Z Xian, E Biyik, D Held… - arXiv preprint arXiv …, 2024 - arxiv.org
Reward engineering has long been a challenge in Reinforcement Learning (RL) research,
as it often requires extensive human effort and iterative processes of trial-and-error to design …

GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation

Z Wang, J Chen, Z Chen, P Xie… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper presents GenH2R, a framework for learning generalizable vision-based human-to-
robot (H2R) handover skills. The goal is to equip robots with the ability to reliably receive …

RoboScript: Code generation for free-form manipulation tasks across real and simulation

J Chen, Y Mu, Q Yu, T Wei, S Wu, Z Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org
Embodied AI has seen rapid progress in high-level task planning and code generation for
open-world robot manipulation. However, previous studies put much …

View selection for 3d captioning via diffusion ranking

T Luo, J Johnson, H Lee - arXiv preprint arXiv:2404.07984, 2024 - arxiv.org
Scalable annotation approaches are crucial for constructing extensive 3D-text datasets,
facilitating a broader range of applications. However, existing methods sometimes lead to …

Learning Reward for Robot Skills Using Large Language Models via Self-Alignment

Y Zeng, Y Mu, L Shao - arXiv preprint arXiv:2405.07162, 2024 - arxiv.org
Learning reward functions remains a bottleneck in equipping a robot with a broad repertoire of
skills. Large Language Models (LLM) contain valuable task-related knowledge that can …

Towards Unified Alignment Between Agents, Humans, and Environment

Z Yang, A Liu, Z Liu, K Liu, F Xiong, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid progress of foundation models has led to the flourishing of autonomous agents,
which leverage the universal capabilities of foundation models to conduct reasoning …

Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration

Y Zhang, S Yang, C Bai, F Wu, X Li, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Grounding the reasoning ability of large language models (LLMs) for embodied tasks is
challenging due to the complexity of the physical world. In particular, LLM planning for multi …

The Metaverse: Innovations and generative AI

JS Jauhiainen - International Journal of Innovation Studies, 2024 - Elsevier
Today, the Metaverse consists of various platforms, including digital twins of the physical
world as well as virtual and blended digital-material environments that offer immersive …