You only group once: Efficient point-cloud processing with token representation and relation...

Z Kong, P Dong, X Ma, X Meng, W Niu, M Sun… - European conference on …, 2022 - Springer

Abstract Recently, Vision Transformer (ViT) has continuously established new milestones in
the computer vision field, while the high computation and memory cost makes its …

Cited by 123 · Related articles · All 6 versions

[PDF] arxiv.org

Image2point: 3d point-cloud understanding with 2d image pretrained models

C Xu, S Yang, T Galanti, B Wu, X Yue, B Zhai… - … on Computer Vision, 2022 - Springer

Abstract 3D point-clouds and 2D images are different visual representations of the physical
world. While human vision can understand both representations, computer vision models …

Cited by 75 · Related articles · All 7 versions

[PDF] arxiv.org

Lcpformer: Towards effective 3d point cloud analysis via local context propagation in transformers

Z Huang, Z Zhao, B Li, J Han - IEEE Transactions on Circuits …, 2023 - ieeexplore.ieee.org

Transformer with its underlying attention mechanism and the ability to capture long-range
dependencies makes it become a natural choice for unordered point cloud data. However …

Cited by 36 · Related articles · All 8 versions

[PDF] thecvf.com

Irisformer: Dense vision transformers for single-image inverse rendering in indoor scenes

R Zhu, Z Li, J Matai, F Porikli… - Proceedings of the …, 2022 - openaccess.thecvf.com

Indoor scenes exhibit significant appearance variations due to myriad interactions between
arbitrarily diverse object shapes, spatially-changing materials, and complex lighting …

Cited by 32 · Related articles · All 11 versions

[PDF] thecvf.com

Delflow: Dense efficient learning of scene flow for large-scale point clouds

C Peng, G Wang, XW Lo, X Wu, C Xu… - Proceedings of the …, 2023 - openaccess.thecvf.com

Point clouds are naturally sparse, while image pixels are dense. The inconsistency limits
feature fusion from both modalities for point-wise scene flow estimation. Previous methods …

Cited by 6 · Related articles · All 5 versions

[PDF] arxiv.org

Detmatch: Two teachers are better than one for joint 2d and 3d semi-supervised object detection

J Park, C Xu, Y Zhou, M Tomizuka, W Zhan - European Conference on …, 2022 - Springer

While numerous 3D detection works leverage the complementary relationship between RGB
images and point clouds, developments in the broader framework of semi-supervised object …

Cited by 24 · Related articles · All 5 versions

[PDF] thecvf.com

Visual transformers: Where do transformers really belong in vision models?

B Wu, C Xu, X Dai, A Wan, P Zhang… - Proceedings of the …, 2021 - openaccess.thecvf.com

A recent trend in computer vision is to replace convolutions with transformers. However, the
performance gain of transformers is attained at a steep cost, requiring GPU years and …

Cited by 20 · Related articles · All 4 versions

[PDF] arxiv.org

Open-vocabulary 3d detection via image-level class and debiased cross-modal contrastive learning

Y Lu, C Xu, X Wei, X Xie, M Tomizuka… - arXiv preprint arXiv …, 2022 - arxiv.org

Current point-cloud detection methods have difficulty detecting the open-vocabulary objects
in the real world, due to their limited generalization capability. Moreover, it is extremely …

Cited by 11 · Related articles · All 3 versions

[PDF] arxiv.org

Collect-and-distribute transformer for 3d point cloud analysis

H Qiu, B Yu, D Tao - arXiv preprint arXiv:2306.01257, 2023 - arxiv.org

Remarkable advancements have been made recently in point cloud analysis through the
exploration of transformer architecture, but it remains challenging to effectively learn local …

Cited by 5 · Related articles · All 2 versions

[PDF] arxiv.org

A simple and efficient multi-task network for 3d object detection and road understanding

D Feng, Y Zhou, C Xu, M Tomizuka… - 2021 IEEE/RSJ …, 2021 - ieeexplore.ieee.org

Detecting dynamic objects and predicting static road information such as drivable areas and
ground heights are crucial for safe autonomous driving. Previous works studied each …

Cited by 18 · Related articles · All 6 versions
