Time will tell: New outlooks and a baseline for temporal multi-view 3D object detection

J Park, C Xu, S Yang, K Keutzer, KM Kitani… - The Eleventh …, 2022 - openreview.net
While recent camera-only 3D detection methods leverage multiple timesteps, the limited
history they use significantly hampers the extent to which temporal fusion can improve object …

BEVDet4D: Exploit temporal cues in multi-camera 3D object detection

J Huang, G Huang - arXiv preprint arXiv:2203.17054, 2022 - arxiv.org
Single frame data contains finite information which limits the performance of the existing
vision-based multi-camera 3D object detection paradigms. For fundamentally pushing the …

BEVDistill: Cross-modal BEV distillation for multi-view 3D object detection

Z Chen, Z Li, S Zhang, L Fang, Q Jiang… - arXiv preprint arXiv …, 2022 - arxiv.org
3D object detection from multiple image views is a fundamental and challenging task for
visual scene understanding. Owing to its low cost and high efficiency, multi-view 3D object …

AutoAlign: Pixel-instance feature aggregation for multi-modal 3D object detection

Z Chen, Z Li, S Zhang, L Fang, Q Jiang, F Zhao… - arXiv preprint arXiv …, 2022 - arxiv.org
Object detection through either RGB images or the LiDAR point clouds has been extensively
explored in autonomous driving. However, it remains challenging to make these two data …

DistillBEV: Boosting multi-camera 3D object detection with cross-modal knowledge distillation

Z Wang, D Li, C Luo, C Xie… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
3D perception based on the representations learned from multi-camera bird's-eye-view
(BEV) is trending as cameras are cost-effective for mass production in autonomous …

VISTA: Boosting 3D object detection via dual cross-view spatial attention

S Deng, Z Liang, L Sun, K Jia - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Detecting objects from LiDAR point clouds is of tremendous significance in autonomous
driving. In spite of good progress, accurate and reliable 3D detection is yet to be achieved …

Exploring data augmentation for multi-modality 3D object detection

W Zhang, Z Wang, CC Loy - arXiv preprint arXiv:2012.12741, 2020 - arxiv.org
It is counter-intuitive that multi-modality methods based on point cloud and images perform
only marginally better or sometimes worse than approaches that solely use point cloud. This …

Exploring object-centric temporal modeling for efficient multi-view 3D object detection

S Wang, Y Liu, T Wang, Y Li… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In this paper, we propose a long-sequence modeling framework, named StreamPETR, for
multi-view 3D object detection. Built upon the sparse query design in the PETR series, we …

NeRF-Det: Learning geometry-aware volumetric representation for multi-view 3D object detection

C Xu, B Wu, J Hou, S Tsai, R Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present NeRF-Det, a novel method for indoor 3D detection with posed RGB
images as input. Unlike existing indoor 3D detection methods that struggle to model scene …

SparseFusion: Fusing multi-modal sparse representations for multi-sensor 3D object detection

Y Xie, C Xu, MJ Rakotosaona, P Rim… - Proceedings of the …, 2023 - openaccess.thecvf.com
By identifying four important components of existing LiDAR-camera 3D object detection
methods (LiDAR and camera candidates, transformation, and fusion outputs), we observe …