Prospection: Interpretable plans from language by predicting the future

C Paxton, Y Bisk, J Thomason… - … on Robotics and …, 2019 - ieeexplore.ieee.org
High-level human instructions often correspond to behaviors with multiple implicit steps. In
order for robots to be useful in the real world, they must be able to reason over both …

Large-scale actionless video pre-training via discrete diffusion for efficient policy learning

H He, C Bai, L Pan, W Zhang, B Zhao, X Li - arXiv preprint arXiv …, 2024 - arxiv.org
Learning a generalist embodied agent capable of completing multiple tasks poses
challenges, primarily stemming from the scarcity of action-labeled robotic datasets. In …

State-only imitation learning for dexterous manipulation

I Radosavovic, X Wang, L Pinto… - 2021 IEEE/RSJ …, 2021 - ieeexplore.ieee.org
Modern model-free reinforcement learning methods have recently demonstrated impressive
results on a number of problems. However, complex domains like dexterous manipulation …

Learning a universal human prior for dexterous manipulation from human preference

Z Ding, Y Chen, AZ Ren, SS Gu, Q Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Generating human-like behavior on robots is a great challenge especially in dexterous
manipulation tasks with robotic hands. Scripting policies from scratch is intractable due to …

Learning predictive models from observation and interaction

K Schmeckpeper, A Xie, O Rybkin, S Tian… - … on Computer Vision, 2020 - Springer
Learning predictive models from interaction with the world allows an agent, such as a robot,
to learn about how the world works, and then use this learned model to plan coordinated …

LIV: Language-image representations and rewards for robotic control

YJ Ma, V Kumar, A Zhang, O Bastani… - International …, 2023 - proceedings.mlr.press
We present Language-Image Value learning (LIV), a unified objective for
vision-language representation and reward learning from action-free videos with text annotations …

Mastering robot manipulation with multimodal prompts through pretraining and multi-task fine-tuning

J Li, Q Gao, M Johnston, X Gao, X He… - arXiv preprint arXiv …, 2023 - arxiv.org
Prompt-based learning has been demonstrated as a compelling paradigm contributing to
the tremendous success of large language models (LLMs). Inspired by their success in language …

Integrated cognitive architecture for robot learning of action and language

K Miyazawa, T Horii, T Aoki, T Nagai - Frontiers in Robotics and AI, 2019 - frontiersin.org
The manner in which humans learn, plan, and decide actions is a very compelling subject.
Moreover, the mechanism behind high-level cognitive functions, such as action planning …

PLEX: Making the most of the available data for robotic manipulation pretraining

G Thomas, CA Cheng, R Loynd… - … on Robot Learning, 2023 - proceedings.mlr.press
A rich representation is key to general robotic manipulation, but existing approaches to
representation learning require large amounts of multimodal demonstrations. In this work we …

KITE: Keypoint-conditioned policies for semantic manipulation

P Sundaresan, S Belkhale, D Sadigh… - arXiv preprint arXiv …, 2023 - arxiv.org
While natural language offers a convenient shared interface for humans and robots,
enabling robots to interpret and follow language commands remains a longstanding …