Inner monologue: Embodied reasoning through planning with language models

W Huang, F Xia, T Xiao, H Chan, J Liang… - arXiv preprint arXiv …, 2022 - arxiv.org
Recent works have shown how the reasoning capabilities of Large Language Models
(LLMs) can be applied to domains beyond natural language processing, such as planning …

Experience grounds language

Y Bisk, A Holtzman, J Thomason, J Andreas… - arXiv preprint arXiv …, 2020 - arxiv.org
Language understanding research is held back by a failure to relate language to the
physical world it describes and to the social interactions it facilitates. Despite the incredible …

Vision-and-language navigation: A survey of tasks, methods, and future directions

J Gu, E Stefani, Q Wu, J Thomason… - arXiv preprint arXiv …, 2022 - arxiv.org
A long-term goal of AI research is to build intelligent agents that can communicate with
humans in natural language, perceive the environment, and perform real-world tasks. Vision …

Core challenges in embodied vision-language planning

J Francis, N Kitamura, F Labelle, X Lu, I Navarro… - Journal of Artificial …, 2022 - jair.org
Recent advances in the areas of multimodal machine learning and artificial intelligence (AI)
have led to the development of challenging tasks at the intersection of Computer Vision …

Learning language-conditioned robot behavior from offline data and crowd-sourced annotation

S Nair, E Mitchell, K Chen… - Conference on Robot …, 2022 - proceedings.mlr.press
We study the problem of learning a range of vision-based manipulation tasks from a large
offline dataset of robot interaction. In order to accomplish this, humans need easy and …

Correcting robot plans with natural language feedback

P Sharma, B Sundaralingam, V Blukis, C Paxton… - arXiv preprint arXiv …, 2022 - arxiv.org
When humans design cost or goal specifications for robots, they often produce specifications
that are ambiguous, underspecified, or beyond planners' ability to solve. In these cases …

A persistent spatial semantic representation for high-level natural language instruction execution

V Blukis, C Paxton, D Fox, A Garg… - Conference on Robot …, 2022 - proceedings.mlr.press
Natural language provides an accessible and expressive interface to specify long-term tasks
for robotic agents. However, non-experts are likely to specify such tasks with high-level …

Robotic skill acquisition via instruction augmentation with vision-language models

T Xiao, H Chan, P Sermanet, A Wahid… - arXiv preprint arXiv …, 2022 - arxiv.org
In recent years, much progress has been made in learning robotic manipulation policies that
follow natural language instructions. Such methods typically learn from corpora of robot …

AerialVLN: Vision-and-language navigation for UAVs

S Liu, H Zhang, Y Qi, P Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently emerged Vision-and-Language Navigation (VLN) tasks have drawn
significant attention in both computer vision and natural language processing communities …

Sim-to-real transfer for vision-and-language navigation

P Anderson, A Shrivastava, J Truong… - … on Robot Learning, 2021 - proceedings.mlr.press
We study the challenging problem of releasing a robot in a previously unseen environment,
and having it follow unconstrained natural language navigation instructions. Recent work on …