A survey on multimodal large language models for autonomous driving

C Cui, Y Ma, X Cao, W Ye, Y Zhou… - Proceedings of the …, 2024 - openaccess.thecvf.com
With the emergence of Large Language Models (LLMs) and Vision Foundation Models
(VFMs), multimodal AI systems benefiting from large models have the potential to equally …

End-to-end autonomous driving: Challenges and frontiers

L Chen, P Wu, K Chitta, B Jaeger, A Geiger… - arXiv preprint arXiv …, 2023 - arxiv.org
The autonomous driving community has witnessed a rapid growth in approaches that
embrace an end-to-end algorithm framework, utilizing raw sensor input to generate vehicle …

Language-conditioned learning for robotic manipulation: A survey

H Zhou, X Yao, Y Meng, S Sun, Z Bing, K Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Language-conditioned robotic manipulation represents a cutting-edge area of research,
enabling seamless communication and cooperation between humans and robotic agents …

DME-Driver: Integrating human decision logic and 3D scene perception in autonomous driving

W Han, D Guo, CZ Xu, J Shen - arXiv preprint arXiv:2401.03641, 2024 - arxiv.org
In the field of autonomous driving, two important features of autonomous driving car systems
are the explainability of decision logic and the accuracy of environmental perception. This …

Ground then navigate: Language-guided navigation in dynamic scenes

K Jain, V Chhangani, A Tiwari… - … on Robotics and …, 2023 - ieeexplore.ieee.org
We investigate the Vision-and-Language Navigation (VLN) problem in the context of
autonomous driving in outdoor settings. We solve the problem by explicitly grounding the …

Vision language models in autonomous driving: A survey and outlook

X Zhou, M Liu, E Yurtsever, BL Zagar… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
The applications of Vision-Language Models (VLMs) in the field of Autonomous Driving (AD)
have attracted widespread attention due to their outstanding performance and the ability to …

LASO: Language-guided Affordance Segmentation on 3D Object

Y Li, N Zhao, J Xiao, C Feng… - Proceedings of the …, 2024 - openaccess.thecvf.com
Segmenting affordance in 3D data is key for bridging perception and action in robots.
Existing efforts mostly focus on the visual side and overlook the affordance knowledge from …

Prospective Role of Foundation Models in Advancing Autonomous Vehicles

J Wu, B Gao, J Gao, J Yu, H Chu, Q Yu, X Gong… - Research, 2024 - spj.science.org
With the development of artificial intelligence and breakthroughs in deep learning,
large-scale foundation models (FMs), such as generative pre-trained transformer (GPT), Sora, etc …

Applications of large scale foundation models for autonomous driving

Y Huang, Y Chen, Z Li - arXiv preprint arXiv:2311.12144, 2023 - arxiv.org
Since DARPA Grand Challenges (rural) in 2004/05 and Urban Challenges in 2007,
autonomous driving has been the most active field of AI applications. Recently powered by …

Talk2Car: Predicting physical trajectories for natural language commands

T Deruyttere, D Grujicic, MB Blaschko, MF Moens - IEEE Access, 2022 - ieeexplore.ieee.org
In recent years, there has been an increased interest in giving verbal commands to
self-driving cars. Even though multiple companies have showcased progress towards fully …