Occ3D: A large-scale 3D occupancy prediction benchmark for autonomous driving

X Tian, T Jiang, L Yun, Y Mao, H Yang… - Advances in …, 2024 - proceedings.neurips.cc
Robotic perception requires the modeling of both 3D geometry and semantics. Existing
methods typically focus on estimating 3D bounding boxes, neglecting finer geometric details …

PiMAE: Point cloud and image interactive masked autoencoders for 3D object detection

A Chen, K Zhang, R Zhang, Z Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Masked Autoencoders learn strong visual representations and achieve state-of-the-art
results in several independent modalities, yet very few works have addressed their …

Masked discrimination for self-supervised learning on point clouds

H Liu, M Cai, YJ Lee - European Conference on Computer Vision, 2022 - Springer
Masked autoencoding has achieved great success for self-supervised learning in the image
and language domains. However, mask based pretraining has yet to show benefits for point …

CAGroup3D: Class-aware grouping for 3D object detection on point clouds

H Wang, L Ding, S Dong, S Shi, A Li… - Advances in Neural …, 2022 - proceedings.neurips.cc
We present a novel two-stage fully sparse convolutional 3D object detection framework,
named CAGroup3D. Our proposed method first generates some high-quality 3D proposals …

NeRF-Det: Learning geometry-aware volumetric representation for multi-view 3D object detection

C Xu, B Wu, J Hou, S Tsai, R Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present NeRF-Det, a novel method for indoor 3D detection with posed RGB
images as input. Unlike existing indoor 3D detection methods that struggle to model scene …

Mask3D: Mask transformer for 3D semantic instance segmentation

J Schult, F Engelmann, A Hermans… - … on Robotics and …, 2023 - ieeexplore.ieee.org
Modern 3D semantic instance segmentation approaches predominantly rely on specialized
voting mechanisms followed by carefully designed geometric clustering techniques. Building …

OctFormer: Octree-based transformers for 3D point clouds

PS Wang - ACM Transactions on Graphics (TOG), 2023 - dl.acm.org
We propose octree-based transformers, named OctFormer, for 3D point cloud learning.
OctFormer can not only serve as a general and effective backbone for 3D point cloud …

A simple vision transformer for weakly semi-supervised 3D object detection

D Zhang, D Liang, Z Zou, J Li, X Ye… - Proceedings of the …, 2023 - openaccess.thecvf.com
Advanced 3D object detection methods usually rely on large-scale, elaborately labeled
datasets to achieve good performance. However, labeling the bounding boxes for the 3D …

EmbodiedScan: A holistic multi-modal 3D perception suite towards embodied AI

T Wang, X Mao, C Zhu, R Xu, R Lyu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In the realm of computer vision and robotics, embodied agents are expected to explore their
environment and carry out human instructions. This necessitates the ability to fully …

NeRF-RPN: A general framework for object detection in NeRFs

B Hu, J Huang, Y Liu, YW Tai… - Proceedings of the …, 2023 - openaccess.thecvf.com
This paper presents the first significant object detection framework, NeRF-RPN, which
directly operates on NeRF. Given a pre-trained NeRF model, NeRF-RPN aims to detect all …