Drive like a human: Rethinking autonomous driving with large language models

D Fu, X Li, L Wen, M Dou, P Cai… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we explore the potential of using a large language model (LLM) to understand
the driving environment in a human-like manner and analyze its ability to reason, interpret …

Drive as you speak: Enabling human-like interaction with large language models in autonomous vehicles

C Cui, Y Ma, X Cao, W Ye… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The future of autonomous vehicles lies in the convergence of human-centric design and
advanced AI capabilities. Autonomous vehicles of the future will not only transport …

Multi-modal fusion transformer for end-to-end autonomous driving

A Prakash, K Chitta, A Geiger - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
How should representations from complementary sensors be integrated for autonomous
driving? Geometry-based sensor fusion has shown great promise for perception tasks such …

Charting new territories: Exploring the geographic and geospatial capabilities of multimodal LLMs

J Roberts, T Lüddecke, R Sheikh… - Proceedings of the …, 2024 - openaccess.thecvf.com
Multimodal large language models (MLLMs) have shown remarkable capabilities across a
broad range of tasks, but their knowledge and abilities in the geographic and geospatial …

Drive anywhere: Generalizable end-to-end autonomous driving with multi-modal foundation models

TH Wang, A Maalouf, W Xiao, Y Ban, A Amini… - arXiv preprint arXiv …, 2023 - arxiv.org
As autonomous driving technology matures, end-to-end methodologies have emerged as a
leading strategy, promising seamless integration from perception to control via deep …

Imp: Highly Capable Large Multimodal Models for Mobile Devices

Z Shao, Z Yu, J Yu, X Ouyang, L Zheng, Z Gai… - arXiv preprint arXiv …, 2024 - arxiv.org
By harnessing the capabilities of large language models (LLMs), recent large multimodal
models (LMMs) have shown remarkable versatility in open-world multimodal understanding …

NExT-GPT: Any-to-any multimodal LLM

S Wu, H Fei, L Qu, W Ji, TS Chua - arXiv preprint arXiv:2309.05519, 2023 - arxiv.org
While Multimodal Large Language Models (MM-LLMs) have recently made exciting strides,
they mostly fall prey to the limitation of input-side-only multimodal understanding, without the …

OpenAnnotate2: Multi-Modal Auto-Annotating for Autonomous Driving

Y Zhou, L Cai, X Cheng, Q Zhang, X Xue… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
The demand for high-quality annotated data has surged in recent years for applications
driven by real-world artificial intelligence, such as autonomous driving and embodied …

Gemini in reasoning: Unveiling commonsense in multimodal large language models

Y Wang, Y Zhao - arXiv preprint arXiv:2312.17661, 2023 - arxiv.org
The burgeoning interest in Multimodal Large Language Models (MLLMs), such as OpenAI's
GPT-4V(ision), has significantly impacted both academic and industrial realms. These …

Efficient multimodal large language models: A survey

Y Jin, J Li, Y Liu, T Gu, K Wu, Z Jiang, M He… - arXiv preprint arXiv …, 2024 - arxiv.org
In the past year, Multimodal Large Language Models (MLLMs) have demonstrated
remarkable performance in tasks such as visual question answering, visual understanding …