We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks. Our approach, called Large LAnguage model Reinforcement …
Humans use different modalities, such as speech, text, images, and videos, to communicate their intent and goals with teammates. For robots to become better assistants, we aim to …
The ability to learn and refine behavior after deployment has become ever more important for robots as we design them to operate in unstructured environments like households. In …
This review paper provides a comprehensive analysis of over 100 research papers focused on the challenges of robotic grasping and the effectiveness of various machine learning …
X Sun, P Chen, J Fan, J Chen, T Li… - Advances in Neural …, 2024 - proceedings.neurips.cc
Learning to navigate to an image-specified goal is an important but challenging task for autonomous systems like household robots. The agent is required to well understand and …
The main challenge in learning image-conditioned robotic policies is acquiring a visual representation conducive to low-level control. Due to the high dimensionality of the image …
For intelligent agents (e.g., robots) to be seamlessly integrated into human society, humans must be able to understand their decision making. For example, the decision making of …
People often learn categories through interaction with knowledgeable others who may use verbal explanations, visual exemplars, or both, to share their knowledge. Verbal and …