From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

C Lu, C Qian, G Zheng, H Fan, H Gao, J Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-modal Large Language Models (MLLMs) have shown impressive abilities in
generating reasonable responses to multi-modal content. However, there is …

AgentsCoDriver: Large Language Model Empowered Collaborative Driving with Lifelong Learning

S Hu, Z Fang, Z Fang, X Chen, Y Fang - arXiv preprint arXiv:2404.06345, 2024 - arxiv.org
Connected and autonomous driving has developed rapidly in recent years. However, current
autonomous driving systems, which are primarily based on data-driven approaches, exhibit …

LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models

M Peng, X Guo, X Chen, M Zhu, K Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
To ensure safe driving in dynamic environments, autonomous vehicles should possess the
capability to accurately predict the lane change intentions of surrounding vehicles in …

Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning

M Shukor, A Rame, C Dancette, M Cord - arXiv preprint arXiv:2310.00647, 2023 - arxiv.org
Following the success of Large Language Models (LLMs), Large Multimodal Models
(LMMs), such as the Flamingo model and its subsequent competitors, have started to …

ADriver-I: A General World Model for Autonomous Driving

F Jia, W Mao, Y Liu, Y Zhao, Y Wen, C Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Typically, autonomous driving adopts a modular design, which divides the full stack into
perception, prediction, planning and control parts. Though interpretable, such modular …

Ovis: Structural Embedding Alignment for Multimodal Large Language Model

S Lu, Y Li, QG Chen, Z Xu, W Luo, K Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Current Multimodal Large Language Models (MLLMs) typically integrate a pre-trained LLM
with another pre-trained vision transformer through a connector, such as an MLP, endowing …

Zenseact Open Dataset: A Large-Scale and Diverse Multimodal Dataset for Autonomous Driving

M Alibeigi, W Ljungbergh, A Tonderski… - Proceedings of the …, 2023 - openaccess.thecvf.com
Existing datasets for autonomous driving (AD) often lack diversity and long-range
capabilities, focusing instead on 360° perception and temporal reasoning. To address this …

Large Language Models Powered Context-aware Motion Prediction

X Zheng, L Wu, Z Yan, Y Tang, H Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
Motion prediction is among the most fundamental tasks in autonomous driving. Traditional
methods of motion forecasting primarily encode vector information of maps and historical …

mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

Q Ye, H Xu, J Ye, M Yan, A Hu, H Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Multi-modal Large Language Models (MLLMs) have demonstrated impressive
instruction abilities across various open-ended tasks. However, previous methods have …

Gated Recurrent Fusion to Learn Driving Behavior from Temporal Multimodal Data

A Narayanan, A Siravuru… - IEEE Robotics and …, 2020 - ieeexplore.ieee.org
The Tactical Driver Behavior modeling problem requires an understanding of driver actions
in complicated urban scenarios from rich multimodal signals including video, LiDAR and …