Point Transformer V2: Grouped vector attention and partition-based pooling

X Wu, Y Lao, L Jiang, X Liu… - Advances in Neural …, 2022 - proceedings.neurips.cc
As a pioneering work exploring transformer architecture for 3D point cloud understanding,
Point Transformer achieves impressive results on multiple highly competitive benchmarks. In …

Stratified transformer for 3D point cloud segmentation

X Lai, J Liu, L Jiang, L Wang, H Zhao… - Proceedings of the …, 2022 - openaccess.thecvf.com
3D point cloud segmentation has made tremendous progress in recent years. Most
current methods focus on aggregating local features, but fail to directly model long-range …

OmniObject3D: Large-vocabulary 3D object dataset for realistic perception, reconstruction and generation

T Wu, J Zhang, X Fu, Y Wang, J Ren… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of
large-scale real-scanned 3D databases. To facilitate the development of 3D perception …

CrossPoint: Self-supervised cross-modal contrastive learning for 3D point cloud understanding

M Afham, I Dissanayake… - Proceedings of the …, 2022 - openaccess.thecvf.com
Manual annotation of large-scale point cloud dataset for varying tasks such as 3D object
classification, segmentation and detection is often laborious owing to the irregular structure …

Rethinking network design and local geometry in point cloud: A simple residual MLP framework

X Ma, C Qin, H You, H Ran, Y Fu - arXiv preprint arXiv:2202.07123, 2022 - arxiv.org
Point cloud analysis is challenging due to irregularity and unordered data structure. To
capture the 3D geometries, prior works mainly rely on exploring sophisticated local …

Point Transformer V3: Simpler Faster Stronger

X Wu, L Jiang, PS Wang, Z Liu, X Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper is not motivated to seek innovation within the attention mechanism. Instead it
focuses on overcoming the existing trade-offs between accuracy and efficiency within the …

Learning 3D representations from 2D pre-trained models via image-to-point masked autoencoders

R Zhang, L Wang, Y Qiao, P Gao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Pre-training by numerous image data has become de-facto for robust 2D representations. In
contrast, due to the expensive data processing, a paucity of 3D datasets severely hinders …

MVImgNet: A large-scale dataset of multi-view images

X Yu, M Xu, Y Zhang, H Liu, C Ye… - Proceedings of the …, 2023 - openaccess.thecvf.com
Being data-driven is one of the most iconic properties of deep learning algorithms. The birth
of ImageNet drives a remarkable trend of "learning from large-scale data" in computer vision …

Surface representation for point clouds

H Ran, J Liu, C Wang - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
Most prior work represents the shapes of point clouds by coordinates. However, it is
insufficient to describe the local geometry directly. In this paper, we present RepSurf …

PLA: Language-driven open-vocabulary 3D scene understanding

R Ding, J Yang, C Xue, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Open-vocabulary scene understanding aims to localize and recognize unseen categories
beyond the annotated label space. The recent breakthrough of 2D open-vocabulary …