- Academic resource search

A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks with different data modalities. A PFM (e.g., BERT, ChatGPT, and GPT-4) is …

Cited by 413 · Related articles · All 2 versions

[PDF] arxiv.org

3D object detection for autonomous driving: A comprehensive survey

J Mao, S Shi, X Wang, H Li - International Journal of Computer Vision, 2023 - Springer

Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …

Cited by 100 · Related articles · All 8 versions

[PDF] thecvf.com

VoxelNeXt: Fully sparse VoxelNet for 3D object detection and tracking

Y Chen, J Liu, X Zhang, X Qi… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

3D object detectors usually rely on hand-crafted proxies, e.g., anchors or centers,
and translate well-studied 2D frameworks to 3D. Thus, sparse voxel features need to be …

Cited by 145 · Related articles · All 6 versions

[PDF] neurips.cc

Multimodal virtual point 3D detection

T Yin, X Zhou, P Krähenbühl - Advances in Neural …, 2021 - proceedings.neurips.cc

Lidar-based sensing drives current autonomous vehicles. Despite rapid progress, current
Lidar sensors still lag two decades behind traditional color cameras in terms of resolution …

Cited by 208 · Related articles · All 8 versions

[PDF] arxiv.org

CenterFormer: Center-based transformer for 3D object detection

Z Zhou, X Zhao, Y Wang, P Wang… - European Conference on …, 2022 - Springer

Query-based transformer has shown great potential in constructing long-range attention in
many image-domain tasks, but has rarely been considered in LiDAR-based 3D object …

Cited by 119 · Related articles · All 6 versions

[PDF] thecvf.com

LoGoNet: Towards accurate 3D object detection with local-to-global cross-modal fusion

X Li, T Ma, Y Hou, B Shi, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com

LiDAR-camera fusion methods have shown impressive performance in 3D object detection.
Recent advanced multi-modal methods mainly perform global fusion, where image features …

Cited by 67 · Related articles · All 6 versions

[PDF] thecvf.com

DSVT: Dynamic sparse voxel transformer with rotated sets

H Wang, C Shi, S Shi, M Lei, S Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Designing an efficient yet deployment-friendly 3D backbone to handle sparse point clouds is
a fundamental problem in 3D perception. Compared with the customized sparse …

Cited by 61 · Related articles · All 6 versions

[PDF] arxiv.org

PersFormer: 3D lane detection via perspective transformer and the OpenLane benchmark

L Chen, C Sima, Y Li, Z Zheng, J Xu, X Geng… - … on Computer Vision, 2022 - Springer

Methods for 3D lane detection have been recently proposed to address the issue of
inaccurate lane layouts in many autonomous driving scenarios (uphill/downhill, bump, etc.) …

Cited by 126 · Related articles · All 4 versions

[PDF] thecvf.com

FlatFormer: Flattened window attention for efficient point cloud transformer

Z Liu, X Yang, H Tang, S Yang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Transformer, as an alternative to CNN, has been proven effective in many modalities (e.g.,
texts and images). For 3D point cloud transformers, existing efforts focus primarily on …

Cited by 47 · Related articles · All 6 versions

[PDF] thecvf.com

CAT-Det: Contrastively augmented transformer for multi-modal 3D object detection

Y Zhang, J Chen, D Huang - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com

In autonomous driving, LiDAR point-clouds and RGB images are two major data modalities
with complementary cues for 3D object detection. However, it is quite difficult to sufficiently …

Cited by 102 · Related articles · All 5 versions
