A survey on multimodal large language models for autonomous driving

C Cui, Y Ma, X Cao, W Ye, Y Zhou… - Proceedings of the …, 2024 - openaccess.thecvf.com
With the emergence of Large Language Models (LLMs) and Vision Foundation Models
(VFMs), multimodal AI systems benefiting from large models have the potential to equally …

Receive, reason, and react: Drive as you say, with large language models in autonomous vehicles

C Cui, Y Ma, X Cao, W Ye… - IEEE Intelligent …, 2024 - ieeexplore.ieee.org
The fusion of human-centric design and artificial intelligence capabilities has opened up
new possibilities for next-generation autonomous vehicles that go beyond traditional …

Online monocular lane mapping using Catmull-Rom spline

Z Qiao, Z Yu, H Yin, S Shen - 2023 IEEE/RSJ International …, 2023 - ieeexplore.ieee.org
In this study, we introduce an online monocular lane mapping approach that solely relies on
a single camera and odometry for generating spline-based maps. Our proposed technique …

MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding

X Cao, T Zhou, Y Ma, W Ye, C Cui… - Proceedings of the …, 2024 - openaccess.thecvf.com
Vision-language generative AI has demonstrated remarkable promise for empowering
cross-modal scene understanding of autonomous driving and high-definition (HD) map systems …

High-definition map automatic annotation system based on active learning

C Zheng, X Cao, K Tang, Z Cao, E Sizikova… - AI …, 2023 - Wiley Online Library
As autonomous vehicle technology advances, high-definition (HD) maps have become
essential for ensuring safety and navigation accuracy. However, creating HD maps with …

CEMFormer: Learning to predict driver intentions from in-cabin and external cameras via spatial-temporal transformers

Y Ma, W Ye, X Cao, A Abdelraouf, K Han… - 2023 IEEE 26th …, 2023 - ieeexplore.ieee.org
Driver intention prediction seeks to anticipate drivers' actions by analyzing their behaviors
with respect to surrounding traffic environments. Existing approaches primarily focus on late …

MapDistill: Boosting Efficient Camera-based HD Map Construction via Camera-LiDAR Fusion Model Distillation

X Hao, R Li, H Zhang, D Li, R Yin, S Jung… - arXiv preprint arXiv …, 2024 - arxiv.org
Online high-definition (HD) map construction is an important and challenging task in
autonomous driving. Recently, there has been a growing interest in cost-effective multi-view …