VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation

J Li, A Padmakumar, G Sukhatme, M Bansal - arXiv preprint arXiv …, 2024 - arxiv.org
Outdoor Vision-and-Language Navigation (VLN) requires an agent to navigate through
realistic 3D outdoor environments based on natural language instructions. The performance …

MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding

X Cao, T Zhou, Y Ma, W Ye, C Cui… - Proceedings of the …, 2024 - openaccess.thecvf.com
Vision-language generative AI has demonstrated remarkable promise for empowering cross-
modal scene understanding of autonomous driving and high-definition (HD) map systems …

Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models

X Ding, J Han, H Xu, X Liang… - Proceedings of the …, 2024 - openaccess.thecvf.com
The rise of multimodal large language models (MLLMs) has spurred interest in language-
based driving tasks. However, existing research typically focuses on limited tasks and often …

Drive as you speak: Enabling human-like interaction with large language models in autonomous vehicles

C Cui, Y Ma, X Cao, W Ye… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The future of autonomous vehicles lies in the convergence of human-centric design and
advanced AI capabilities. Autonomous vehicles of the future will not only transport …

Lampilot: An open benchmark dataset for autonomous driving with language model programs

Y Ma, C Cui, X Cao, W Ye, P Liu, J Lu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Autonomous driving (AD) has made significant strides in recent years. However, existing
frameworks struggle to interpret and execute spontaneous user instructions, such as "…

Receive, reason, and react: Drive as you say, with large language models in autonomous vehicles

C Cui, Y Ma, X Cao, W Ye… - IEEE Intelligent …, 2024 - ieeexplore.ieee.org
The fusion of human-centric design and artificial intelligence capabilities has opened up
new possibilities for next-generation autonomous vehicles that go beyond traditional …

Dolphins: Multimodal language model for driving

Y Ma, Y Cao, J Sun, M Pavone, C Xiao - arXiv preprint arXiv:2312.00438, 2023 - arxiv.org
The quest continues for fully autonomous vehicles (AVs) capable of navigating complex real-world
scenarios with human-like understanding and responsiveness. In this paper, we introduce …

LLM4Drive: A Survey of Large Language Models for Autonomous Driving

Z Yang, X Jia, H Li, J Yan - arXiv e-prints, 2023 - ui.adsabs.harvard.edu
Autonomous driving technology, a catalyst for revolutionizing transportation and urban
mobility, tends to transition from rule-based systems to data-driven strategies …

HiLM-D: Towards High-Resolution Understanding in Multimodal Large Language Models for Autonomous Driving

X Ding, J Han, H Xu, W Zhang, X Li - arXiv preprint arXiv:2309.05186, 2023 - arxiv.org
Autonomous driving systems generally employ separate models for different tasks, resulting
in intricate designs. For the first time, we leverage singular multimodal large language …

DriveLLM: Charting the path toward full autonomous driving with large language models

Y Cui, S Huang, J Zhong, Z Liu, Y Wang… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Human drivers instinctively reason with commonsense knowledge to predict hazards in
unfamiliar scenarios and to understand the intentions of other road users. However, this …