GPT-4V(ision) for robotics: Multimodal task planning from human demonstration

N Wake, A Kanehira, K Sasabuchi, J Takamatsu… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce a pipeline that enhances a general-purpose Vision Language Model, GPT-4V(ision),
by integrating observations of human actions to facilitate robotic manipulation. This …

Inferring goals with gaze during teleoperated manipulation

RM Aronson, N Almutlak… - 2021 IEEE/RSJ …, 2021 - ieeexplore.ieee.org
Assistive robot manipulators help people with upper motor impairments perform tasks by
themselves. However, teleoperating a robot to perform complex tasks is difficult. Shared …

VoxPoser: Composable 3D value maps for robotic manipulation with language models

W Huang, C Wang, R Zhang, Y Li, J Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) are shown to possess a wealth of actionable knowledge that
can be extracted for robot manipulation in the form of reasoning and planning. Despite the …

Surrogate assisted generation of human-robot interaction scenarios

V Bhatt, H Nemlekar, MC Fontaine, B Tjanaka… - arXiv preprint arXiv …, 2023 - arxiv.org
As human-robot interaction (HRI) systems advance, so does the difficulty of evaluating and
understanding the strengths and limitations of these systems in different environments and …

Modeling long-horizon tasks as sequential interaction landscapes

S Pirk, K Hausman, A Toshev, M Khansari - arXiv preprint arXiv …, 2020 - arxiv.org
Complex object manipulation tasks often span over long sequences of operations. Task
planning over long-time horizons is a challenging and open problem in robotics, and its …

Any-point trajectory modeling for policy learning

C Wen, X Lin, J So, K Chen, Q Dou, Y Gao… - arXiv preprint arXiv …, 2023 - arxiv.org
Learning from demonstration is a powerful method for teaching robots new skills, and more
demonstration data often improves policy learning. However, the high cost of collecting …

Structured world models from human videos

R Mendonca, S Bahl, D Pathak - arXiv preprint arXiv:2308.10901, 2023 - arxiv.org
We tackle the problem of learning complex, general behaviors directly in the real world. We
propose an approach for robots to efficiently learn manipulation skills using only a handful of …

Auto-conditioned recurrent mixture density networks for learning generalizable robot skills

H Zhang, E Heiden, S Nikolaidis, JJ Lim… - arXiv preprint arXiv …, 2018 - arxiv.org
Personal robots assisting humans must perform complex manipulation tasks that are
typically difficult to specify in traditional motion planning pipelines, where multiple objectives …

Telemanipulation via virtual reality interfaces with enhanced environment models

M Wonsick, T Keleștemur, S Alt… - 2021 IEEE/RSJ …, 2021 - ieeexplore.ieee.org
Extreme environments, such as search and rescue missions, defusing bombs, or exploring
extraterrestrial planets, are unsafe environments for humans to be in. Robots enable …

SQUIRL: Robust and efficient learning from video demonstration of long-horizon robotic manipulation tasks

B Wu, F Xu, Z He, A Gupta… - 2020 IEEE/RSJ …, 2020 - ieeexplore.ieee.org
Recent advances in deep reinforcement learning (RL) have demonstrated its potential to
learn complex robotic manipulation tasks. However, RL still requires the robot to collect a …