Objaverse: A universe of annotated 3D objects

M Deitke, D Schwenk, J Salvador… - Proceedings of the …, 2023 - openaccess.thecvf.com
Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebImageText, and
LAION have propelled recent dramatic progress in AI. Large neural models trained on such …

OmniObject3D: Large-vocabulary 3D object dataset for realistic perception, reconstruction and generation

T Wu, J Zhang, X Fu, Y Wang, J Ren… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of
large-scale real-scanned 3D databases. To facilitate the development of 3D perception …

PLA: Language-driven open-vocabulary 3D scene understanding

R Ding, J Yang, C Xue, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Open-vocabulary scene understanding aims to localize and recognize unseen categories
beyond the annotated label space. The recent breakthrough of 2D open-vocabulary …

Lowis3D: Language-driven open-world instance-level 3D scene understanding

R Ding, J Yang, C Xue, W Zhang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Open-world instance-level scene understanding aims to locate and recognize unseen object
categories that are not present in the annotated dataset. This task is challenging because …

ULIP: Learning a unified representation of language, images, and point clouds for 3D understanding

L Xue, M Gao, C Xing, R Martín-Martín… - Proceedings of the …, 2023 - openaccess.thecvf.com
The recognition capabilities of current state-of-the-art 3D models are limited by datasets with
a small number of annotated data and a pre-defined set of categories. In its 2D counterpart …

Learning 3D object categories by looking around them

D Novotny, D Larlus, A Vedaldi - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com
Traditional approaches for learning 3D object categories use either synthetic data or manual
supervision. In this paper, we propose a method which does not require manual annotations …

EmbodiedScan: A holistic multi-modal 3D perception suite towards embodied AI

T Wang, X Mao, C Zhu, R Xu, R Lyu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In the realm of computer vision and robotics, embodied agents are expected to explore their
environment and carry out human instructions. This necessitates the ability to fully …

CLIP-FO3D: Learning free open-world 3D scene representations from 2D dense CLIP

J Zhang, R Dong, K Ma - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Training a 3D scene understanding model requires complicated human annotations, which
are laborious to collect and result in a model only encoding close-set object semantics. In …

Objaverse-XL: A universe of 10M+ 3D objects

M Deitke, R Liu, M Wallingford, H Ngo… - Advances in …, 2024 - proceedings.neurips.cc
Natural language processing and 2D vision models have attained remarkable proficiency on
many tasks primarily by escalating the scale of training data. However, 3D vision tasks have …

Scalable 3D captioning with pretrained models

T Luo, C Rockwell, H Lee… - Advances in Neural …, 2024 - proceedings.neurips.cc
We introduce Cap3D, an automatic approach for generating descriptive text for 3D objects.
This approach utilizes pretrained models from image captioning, image-text alignment, and …