Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebImageText, and LAION have propelled recent dramatic progress in AI. Large neural models trained on such …
Large language models (LLMs) and Vision-Language Models (VLMs) have been shown to excel at a variety of tasks, such as commonsense reasoning. Powerful as these models can be …
The capabilities of autonomous mobile robotic systems have been steadily improving due to recent advancements in computer science, engineering, and related disciplines such as …
Goal-conditioned policies for robotic navigation can be trained on large, unannotated datasets, providing good generalization to real-world settings. However, particularly in …
We introduce SceneDiffuser, a conditional generative model for 3D scene understanding. SceneDiffuser provides a unified model for solving scene-conditioned generation …
We present the largest and most comprehensive empirical study of pre-trained visual representations (PVRs) or visual 'foundation models' for Embodied AI. First, we curate …
A Szot, A Clegg, E Undersander… - Advances in neural …, 2021 - proceedings.neurips.cc
Abstract: We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in interactive 3D environments and complex physics-enabled scenarios. We make …
Semantic navigation is necessary for deploying mobile robots in uncontrolled environments such as homes or hospitals. Many learning-based approaches have been proposed in …
Recent years have seen the emergence of pre-trained representations as a powerful abstraction for AI applications in computer vision, natural language, and speech. However …