RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-Grained Correctional Human Feedback

T Yu, Y Yao, H Zhang, T He, Y Han… - Proceedings of the …, 2024 - openaccess.thecvf.com
Multimodal Large Language Models (MLLMs) have recently demonstrated
impressive capabilities in multimodal understanding, reasoning, and interaction. However …

Receive, Reason, and React: Drive as You Say, with Large Language Models in Autonomous Vehicles

C Cui, Y Ma, X Cao, W Ye… - IEEE Intelligent …, 2024 - ieeexplore.ieee.org
The fusion of human-centric design and artificial intelligence capabilities has opened up
new possibilities for next-generation autonomous vehicles that go beyond traditional …

GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing

H Lu, X Niu, J Wang, Y Wang, Q Hu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Multimodal large language models (MLLMs) are designed to process and integrate
information from multiple sources, such as text, speech, images, and videos. Despite their …

Towards Knowledge-Driven Autonomous Driving

X Li, Y Bai, P Cai, L Wen, D Fu, B Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper explores the emerging knowledge-driven autonomous driving technologies. Our
investigation highlights the limitations of current autonomous driving systems, in particular …

CityLLaVA: Efficient Fine-Tuning for VLMs in City Scenario

Z Duan, H Cheng, D Xu, X Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In the vast and dynamic landscape of urban settings, Traffic Safety Description and Analysis
plays a pivotal role in applications ranging from insurance inspection to accident prevention …

MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding

X Cao, T Zhou, Y Ma, W Ye, C Cui… - Proceedings of the …, 2024 - openaccess.thecvf.com
Vision-language generative AI has demonstrated remarkable promise for empowering cross-
modal scene understanding of autonomous driving and high-definition (HD) map systems …

Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases

Y Li, W Zhang, K Chen, Y Liu, P Li, R Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Vision-Language Models (LVLMs), due to their remarkable visual reasoning ability to
understand images and videos, have received widespread attention in the autonomous …

Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous and Instruction-guided Driving

B Yang, H Su, N Gkanatsios, TW Ke… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion models excel at modeling complex and multimodal trajectory distributions for
decision-making and control. Reward-gradient guided denoising has been recently …

PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs

S Nasiriany, F Xia, W Yu, T Xiao, J Liang… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision language models (VLMs) have shown impressive capabilities across a variety of
tasks, from logical reasoning to visual understanding. This opens the door to richer …

Integration of Mixture of Experts and Multimodal Generative AI in Internet of Vehicles: A Survey

M Xu, D Niyato, J Kang, Z Xiong, A Jamalipour… - arXiv preprint arXiv …, 2024 - arxiv.org
Generative AI (GAI) can enhance the cognitive, reasoning, and planning capabilities of
intelligent modules in the Internet of Vehicles (IoV) by synthesizing augmented datasets …