D Bogdoll, Y Yang, JM Zöllner - arXiv preprint arXiv:2311.11762, 2023 - arxiv.org
Learning unsupervised world models for autonomous driving has the potential to improve the reasoning capabilities of today's systems dramatically. However, most work neglects the …
We introduce a self-supervised pretraining method called OccFeat for camera-only Bird's- Eye-View (BEV) segmentation networks. With OccFeat we pretrain a BEV network via …
The popular CLIP model displays impressive zero-shot capabilities thanks to its seamless interaction with arbitrary text prompts. However, its lack of spatial awareness makes it …
In recent years, autonomous driving has garnered escalating attention for its potential to relieve drivers' burdens and improve driving safety. Vision-based 3D occupancy prediction …
3D occupancy prediction based on multi-sensor fusion, crucial for a reliable autonomous driving system, enables fine-grained understanding of 3D scenes. Previous fusion-based 3D …
Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision. Previous methods, confined to onboard …
Perceiving the world as 3D occupancy supports embodied agents to avoid collision with any types of obstacle. While open-vocabulary image understanding has prospered recently, how …
3D occupancy-based perception pipeline has significantly advanced autonomous driving by capturing detailed scene descriptions and demonstrating strong generalizability across …
Semantic occupancy has recently gained significant traction as a prominent method for 3D scene representation. However, most existing camera-based methods rely on costly …