MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models

H Duan, J Yang, Y Qiao, X Fang, L Chen, Y Liu… - Proceedings of the …, 2024 - dl.acm.org

We present VLMEvalKit: an open-source toolkit for evaluating large multi-modality models
based on PyTorch. The toolkit aims to provide a user-friendly and comprehensive framework …

Cited by 33 · Related articles · All 5 versions

VLSBench: Unveiling Visual Leakage in Multimodal Safety

X Hu, D Liu, H Li, X Huang, J Shao - arXiv preprint arXiv:2411.19939, 2024 - arxiv.org

Safety concerns of Multimodal large language models (MLLMs) have gradually become an
important problem in various applications. Surprisingly, previous works indicate a counter …

Cited by 2 · Related articles · All 2 versions

A survey on multimodal benchmarks: In the era of large AI models

L Li, G Chen, H Shi, J Xiao, L Chen - arXiv preprint arXiv:2409.18142, 2024 - arxiv.org

The rapid evolution of Multimodal Large Language Models (MLLMs) has brought substantial
advancements in artificial intelligence, significantly enhancing the capability to understand …

Cited by 3 · Related articles · All 2 versions

A Comprehensive Survey of Multimodal Large Language Models: Concept, Application and Safety

S Liu, W Pu, C Xu, Z Huang, Q Li, H Wang, C Lin… - 2024 - researchsquare.com

Recent advancements in MLLM, such as those exemplified by developments like GPT-4o,
have positioned them as a significant focus within the research community. MLLMs leverage …

Cited by 1 · Related articles · All 2 versions

HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks

F Zhang, L Wu, H Bai, G Lin, X Li, X Yu, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Coding tasks have been valuable for evaluating Large Language Models (LLMs), as they
demand the comprehension of high-level instructions, complex reasoning, and the …

Cited by 1 · Related articles · All 3 versions

Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models

H Yang, L Qu, E Shareghi, G Haffari - arXiv preprint arXiv:2410.23861, 2024 - arxiv.org

Large Multimodal Models (LMMs) have demonstrated the ability to interact with humans
under real-world conditions by combining Large Language Models (LLMs) and modality …

Cited by 1 · Related articles · All 2 versions

Multi-PA: A Multi-perspective Benchmark on Privacy Assessment for Large Vision-Language Models

J Zhang, X Cao, Z Han, S Shan, X Chen - arXiv preprint arXiv:2412.19496, 2024 - arxiv.org

Large Vision-Language Models (LVLMs) exhibit impressive potential across various tasks
but also face significant privacy risks, limiting their practical applications. Current researches …
