Probing Multimodal LLMs as World Models for Driving

S Sreeram, TH Wang, A Maalouf, G Rosman… - arXiv preprint arXiv …, 2024 - arxiv.org
We provide a sober look at the application of Multimodal Large Language Models (MLLMs)
within the domain of autonomous driving and challenge/verify some common assumptions …

A survey on multimodal large language models for autonomous driving

C Cui, Y Ma, X Cao, W Ye, Y Zhou… - Proceedings of the …, 2024 - openaccess.thecvf.com
With the emergence of Large Language Models (LLMs) and Vision Foundation Models
(VFMs), multimodal AI systems benefiting from large models have the potential to equally …

LimSim++: A Closed-Loop Platform for Deploying Multimodal LLMs in Autonomous Driving

D Fu, W Lei, L Wen, P Cai, S Mao, M Dou, B Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
The emergence of Multimodal Large Language Models ((M)LLMs) has ushered in new
avenues in artificial intelligence, particularly for autonomous driving by offering enhanced …

DriveGPT4: Interpretable end-to-end autonomous driving via large language model

Z Xu, Y Zhang, E Xie, Z Zhao, Y Guo, KKY Wong… - arXiv preprint arXiv …, 2023 - arxiv.org
In the past decade, autonomous driving has experienced rapid development in both
academia and industry. However, its limited interpretability remains a significant unsolved …

HiLM-D: Towards high-resolution understanding in multimodal large language models for autonomous driving

X Ding, J Han, H Xu, W Zhang, X Li - arXiv preprint arXiv:2309.05186, 2023 - arxiv.org
Autonomous driving systems generally employ separate models for different tasks resulting
in intricate designs. For the first time, we leverage singular multimodal large language …

DriveMLM: Aligning multi-modal large language models with behavioral planning states for autonomous driving

W Wang, J Xie, CY Hu, H Zou, J Fan, W Tong… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have opened up new possibilities for intelligent agents,
endowing them with human-like thinking and cognitive abilities. In this work, we delve into …

ADriver-I: A general world model for autonomous driving

F Jia, W Mao, Y Liu, Y Zhao, Y Wen, C Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Typically, autonomous driving adopts a modular design, which divides the full stack into
perception, prediction, planning and control parts. Though interpretable, such modular …

OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning

S Wang, Z Yu, X Jiang, S Lan, M Shi, N Chang… - arXiv preprint arXiv …, 2024 - arxiv.org
The advances in multimodal large language models (MLLMs) have led to growing interests
in LLM-based autonomous driving agents to leverage their strong reasoning capabilities …

Driving with LLMs: Fusing object-level vector modality for explainable autonomous driving

L Chen, O Sinavski, J Hünermann, A Karnsund… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have shown promise in the autonomous driving sector,
particularly in generalization and interpretability. We introduce a unique object-level …

On the road with GPT-4V(ision): Early explorations of visual-language model on autonomous driving

L Wen, X Yang, D Fu, X Wang, P Cai, X Li, T Ma… - arXiv preprint arXiv …, 2023 - arxiv.org
The pursuit of autonomous driving technology hinges on the sophisticated integration of
perception, decision-making, and control systems. Traditional approaches, both data-driven …