3D object detection for autonomous driving: A comprehensive survey

J Mao, S Shi, X Wang, H Li - International Journal of Computer Vision, 2023 - Springer
Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …

Deep learning for 3D point clouds: A survey

Y Guo, H Wang, Q Hu, H Liu, L Liu… - IEEE transactions on …, 2020 - ieeexplore.ieee.org
Point cloud learning has lately attracted increasing attention due to its wide applications in
many areas, such as computer vision, autonomous driving, and robotics. As a dominating …

TransFusion: Robust LiDAR-camera fusion for 3D object detection with transformers

X Bai, Z Hu, X Zhu, Q Huang, Y Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
LiDAR and camera are two important sensors for 3D object detection in autonomous driving.
Despite the increasing popularity of sensor fusion in this field, the robustness against inferior …

Surface representation for point clouds

H Ran, J Liu, C Wang - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
Most prior work represents the shapes of point clouds by coordinates. However, it is
insufficient to describe the local geometry directly. In this paper, we present RepSurf …

Multimodal token fusion for vision transformers

Y Wang, X Chen, L Cao, W Huang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Many adaptations of transformers have emerged to address the single-modal vision tasks,
where self-attention modules are stacked to handle input sources like images. Intuitively …

DeepInteraction: 3D object detection via modality interaction

Z Yang, J Chen, Z Miao, W Li… - Advances in Neural …, 2022 - proceedings.neurips.cc
Existing top-performance 3D object detectors typically rely on the multi-modal fusion
strategy. This design is however fundamentally restricted due to overlooking the modality …

Searching efficient 3D architectures with sparse point-voxel convolution

H Tang, Z Liu, S Zhao, Y Lin, J Lin, H Wang… - European conference on …, 2020 - Springer
Self-driving cars need to understand 3D scenes efficiently and accurately in order to drive
safely. Given the limited hardware resources, existing 3D perception models are not able to …

PointContrast: Unsupervised pre-training for 3D point cloud understanding

S Xie, J Gu, D Guo, CR Qi, L Guibas… - Computer Vision–ECCV …, 2020 - Springer
Arguably one of the top success stories of deep learning is transfer learning. The finding that
pre-training a network on a rich source set (e.g., ImageNet) can help boost performance once …

Group-free 3D object detection via transformers

Z Liu, Z Zhang, Y Cao, H Hu… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Recently, directly detecting 3D objects from 3D point clouds has received increasing
attention. To extract object representation from an irregular point cloud, existing methods …

Open-vocabulary queryable scene representations for real world planning

B Chen, F Xia, B Ichter, K Rao… - … on Robotics and …, 2023 - ieeexplore.ieee.org
Large language models (LLMs) have unlocked new capabilities of task planning from
human instructions. However, prior attempts to apply LLMs to real-world robotic tasks are …