T Wu, J Zhang, X Fu, Y Wang, J Ren… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of large-scale real-scanned 3D databases. To facilitate the development of 3D perception …
R Ding, J Yang, C Xue, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Open-vocabulary scene understanding aims to localize and recognize unseen categories beyond the annotated label space. The recent breakthrough of 2D open-vocabulary …
R Ding, J Yang, C Xue, W Zhang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Open-world instance-level scene understanding aims to locate and recognize unseen object categories that are not present in the annotated dataset. This task is challenging because …
The recognition capabilities of current state-of-the-art 3D models are limited by datasets with a small number of annotated data and a pre-defined set of categories. In its 2D counterpart …
Traditional approaches for learning 3D object categories use either synthetic data or manual supervision. In this paper, we propose a method which does not require manual annotations …
T Wang, X Mao, C Zhu, R Xu, R Lyu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In the realm of computer vision and robotics embodied agents are expected to explore their environment and carry out human instructions. This necessitates the ability to fully …
J Zhang, R Dong, K Ma - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Training a 3D scene understanding model requires complicated human annotations, which are laborious to collect and result in a model only encoding close-set object semantics. In …
Natural language processing and 2D vision models have attained remarkable proficiency on many tasks primarily by escalating the scale of training data. However, 3D vision tasks have …
We introduce Cap3D, an automatic approach for generating descriptive text for 3D objects. This approach utilizes pretrained models from image captioning, image-text alignment, and …