GPT-4V Explorations: Mining Autonomous Driving

Z Li - arXiv preprint arXiv:2406.16817, 2024 - arxiv.org
This paper explores the application of the GPT-4V(ision) large visual language model to
autonomous driving in mining environments, where traditional systems often falter in …

On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving

L Wen, X Yang, D Fu, X Wang, P Cai, X Li, T Ma… - arXiv preprint arXiv …, 2023 - arxiv.org
The pursuit of autonomous driving technology hinges on the sophisticated integration of
perception, decision-making, and control systems. Traditional approaches, both data-driven …

On the Road with GPT-4V(ision): Explorations of Utilizing Visual-Language Model as Autonomous Driving Agent

L Wen, X Yang, D Fu, X Wang, P Cai, X Li… - ICLR 2024 Workshop …, 2024 - openreview.net
The development of autonomous driving technology depends on merging perception,
decision, and control systems. Traditional strategies have struggled to understand complex …

Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving

V Dewangan, T Choudhary, S Chandhok… - arXiv preprint arXiv …, 2023 - arxiv.org
Talk2BEV is a large vision-language model (LVLM) interface for bird's-eye view (BEV) maps
in autonomous driving contexts. While existing perception systems for autonomous driving …

GPT-4V as Traffic Assistant: An In-depth Look at Vision Language Model on Complex Traffic Events

X Zhou, AC Knoll - arXiv preprint arXiv:2402.02205, 2024 - arxiv.org
The recognition and understanding of traffic incidents, particularly traffic accidents, is a topic
of paramount importance in the realm of intelligent transportation systems and intelligent …

DriveVLM: The convergence of autonomous driving and large vision-language models

X Tian, J Gu, B Li, Y Liu, C Hu, Y Wang, K Zhan… - arXiv preprint arXiv …, 2024 - arxiv.org
A primary hurdle of autonomous driving in urban environments is understanding complex
and long-tail scenarios, such as challenging road conditions and delicate human behaviors …

Reason2Drive: Towards interpretable and chain-based reasoning for autonomous driving

M Nie, R Peng, C Wang, X Cai, J Han… - arXiv preprint arXiv …, 2023 - s4plus.ustc.edu.cn
Large vision-language models (VLMs) have garnered increasing interest in autonomous
driving areas, due to their advanced capabilities in complex reasoning tasks essential for …

Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving

A Gopalkrishnan, R Greer, M Trivedi - arXiv preprint arXiv:2403.19838, 2024 - arxiv.org
Vision-Language Models (VLMs) and Multi-Modal Language models (MMLMs) have
become prominent in autonomous driving research, as these models can provide …

Dolphins: Multimodal language model for driving

Y Ma, Y Cao, J Sun, M Pavone, C Xiao - arXiv preprint arXiv:2312.00438, 2023 - arxiv.org
The quest for fully autonomous vehicles (AVs) capable of navigating complex real-world
scenarios with human-like understanding and responsiveness remains elusive. In this paper, we introduce …