A survey of embodied AI: From simulators to research tasks

J Duan, S Yu, HL Tan, H Zhu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
There has been an emerging paradigm shift from the era of “internet AI” to “embodied AI,”
where AI algorithms and agents no longer learn from datasets of images, videos or text …

Objaverse: A universe of annotated 3D objects

M Deitke, D Schwenk, J Salvador… - Proceedings of the …, 2023 - openaccess.thecvf.com
Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebImageText, and
LAION have propelled recent dramatic progress in AI. Large neural models trained on such …

3D-LLM: Injecting the 3D world into large language models

Y Hong, H Zhen, P Chen, S Zheng… - Advances in …, 2023 - proceedings.neurips.cc
Large language models (LLMs) and vision-language models (VLMs) have been proven to
excel at multiple tasks, such as commonsense reasoning. Powerful as these models can be …

Challenges and solutions for autonomous ground robot scene understanding and navigation in unstructured outdoor environments: A review

L Wijayathunga, A Rassau, D Chai - Applied Sciences, 2023 - mdpi.com
The capabilities of autonomous mobile robotic systems have been steadily improving due to
recent advancements in computer science, engineering, and related disciplines such as …

LM-Nav: Robotic navigation with large pre-trained models of language, vision, and action

D Shah, B Osiński, S Levine - Conference on robot …, 2023 - proceedings.mlr.press
Goal-conditioned policies for robotic navigation can be trained on large, unannotated
datasets, providing for good generalization to real-world settings. However, particularly in …

Diffusion-based generation, optimization, and planning in 3D scenes

S Huang, Z Wang, P Li, B Jia, T Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com
We introduce SceneDiffuser, a conditional generative model for 3D scene understanding.
SceneDiffuser provides a unified model for solving scene-conditioned generation …

Where are we in the search for an artificial visual cortex for embodied intelligence?

A Majumdar, K Yadav, S Arnaud, J Ma… - Advances in …, 2024 - proceedings.neurips.cc
We present the largest and most comprehensive empirical study of pre-trained visual
representations (PVRs) or visual 'foundation models' for Embodied AI. First, we curate …

Habitat 2.0: Training home assistants to rearrange their habitat

A Szot, A Clegg, E Undersander… - Advances in neural …, 2021 - proceedings.neurips.cc
We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in
interactive 3D environments and complex physics-enabled scenarios. We make …

Navigating to objects in the real world

T Gervet, S Chintala, D Batra, J Malik, DS Chaplot - Science Robotics, 2023 - science.org
Semantic navigation is necessary to deploy mobile robots in uncontrolled environments
such as homes or hospitals. Many learning-based approaches have been proposed in …

The unsurprising effectiveness of pre-trained vision models for control

S Parisi, A Rajeswaran… - … on machine learning, 2022 - proceedings.mlr.press
Recent years have seen the emergence of pre-trained representations as a powerful
abstraction for AI applications in computer vision, natural language, and speech. However …