A survey on multimodal large language models for autonomous driving

C Cui, Y Ma, X Cao, W Ye, Y Zhou… - Proceedings of the …, 2024 - openaccess.thecvf.com
With the emergence of Large Language Models (LLMs) and Vision Foundation Models
(VFMs), multimodal AI systems benefiting from large models have the potential to equally …

LaMPilot: An open benchmark dataset for autonomous driving with language model programs

Y Ma, C Cui, X Cao, W Ye, P Liu, J Lu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Autonomous driving (AD) has made significant strides in recent years. However, existing
frameworks struggle to interpret and execute spontaneous user instructions such as "…

DetGPT: Detect what you need via reasoning

R Pi, J Gao, S Diao, R Pan, H Dong, J Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
In recent years, the field of computer vision has seen significant advancements thanks to the
development of large language models (LLMs). These models have enabled more effective …

Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models

X Ding, J Han, H Xu, X Liang… - Proceedings of the …, 2024 - openaccess.thecvf.com
The rise of multimodal large language models (MLLMs) has spurred interest in language-
based driving tasks. However, existing research typically focuses on limited tasks and often …

PerceptionGPT: Effectively fusing visual perception into LLM

R Pi, L Yao, J Gao, J Zhang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The integration of visual inputs with large language models (LLMs) has led to remarkable
advancements in multi-modal capabilities, giving rise to vision large language models …

Dolphins: Multimodal language model for driving

Y Ma, Y Cao, J Sun, M Pavone, C Xiao - arXiv preprint arXiv:2312.00438, 2023 - arxiv.org
The quest continues for fully autonomous vehicles (AVs) capable of navigating complex real-world
scenarios with human-like understanding and responsiveness. In this paper, we introduce …

Human-centric autonomous systems with LLMs for user command reasoning

Y Yang, Q Zhang, C Li, DS Marta… - Proceedings of the …, 2024 - openaccess.thecvf.com
Autonomous driving has made remarkable advancements in recent years,
evolving into a tangible reality. However, human-centric large-scale adoption hinges on …

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance

R Pi, T Han, Y Xie, R Pan, Q Lian, H Dong… - arXiv preprint arXiv …, 2024 - arxiv.org
The deployment of multimodal large language models (MLLMs) has brought forth a unique
vulnerability: susceptibility to malicious attacks through visual inputs. We delve into the novel …

Empowering autonomous driving with large language models: A safety perspective

Y Wang, R Jiao, C Lang, SS Zhan, C Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Autonomous Driving (AD) faces crucial hurdles for commercial launch, notably
diminished public trust and safety concerns stemming from long-tail, unforeseen driving scenarios. This …

Embodied understanding of driving scenarios

Y Zhou, L Huang, Q Bu, J Zeng, T Li, H Qiu… - arXiv preprint arXiv …, 2024 - arxiv.org
Embodied scene understanding serves as the cornerstone for autonomous agents to
perceive, interpret, and respond to open driving scenarios. Such understanding is typically …