J Huang, Y Ye, Z Liang, Y Shan, D Du - European Conference on …, 2025 - Springer
Abstract: 3D object detection with LiDAR-camera fusion encounters overfitting in algorithm development, derived from violating some fundamental rules. We refer to the data annotation …
Y Lei, Z Wang, F Chen, G Wang, P Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Multi-modal 3D scene understanding has gained considerable attention due to its wide applications in many areas, such as autonomous driving and human-computer interaction …
Bird's eye view (BEV) representation has emerged as a dominant solution for describing 3D space in autonomous driving scenarios. However, objects in the BEV representation typically …
This paper proposes a simple yet effective framework, called GiT, simultaneously applicable to various vision tasks with only a vanilla ViT. Motivated by the universality of the Multi-layer …
Sparse 3D detectors have received significant attention since the query-based paradigm embraces low latency without explicit dense BEV feature construction. However, these …
H Yang, H Wang, D Dai… - Advances in Neural …, 2024 - proceedings.neurips.cc
Pre-training is crucial in 3D-related fields such as autonomous driving where point cloud annotation is costly and challenging. Many recent studies on point cloud pre-training …
J Gunn, Z Lenyk, A Sharma, A Donati… - Proceedings of the …, 2024 - openaccess.thecvf.com
Combining complementary sensor modalities is crucial to providing robust perception for safety-critical robotics applications such as autonomous driving (AD). Recent state-of-the-art …
Multimodal sensor fusion is an essential capability for autonomous robots, enabling object detection and decision-making in the presence of failing or uncertain inputs. While recent …
Masked Autoencoders (MAE) play a pivotal role in learning potent representations, delivering outstanding results across various 3D perception tasks essential for autonomous …