Cross-modality safety alignment

S Wang, X Ye, Q Cheng, J Duan, S Li, J Fu… - arXiv preprint arXiv …, 2024 - arxiv.org
As Artificial General Intelligence (AGI) becomes increasingly integrated into various facets of
human life, ensuring the safety and ethical alignment of such systems is paramount …

Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability

D Shu, H Zhao, J Hu, W Liu, L Cheng, M Du - arXiv preprint arXiv …, 2025 - arxiv.org
Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities in
processing both visual and textual information. However, the critical challenge of alignment …

GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing

Y Xiao, A Liu, QJ Cheng, Z Yin, S Liang, J Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Vision-Language Models (LVLMs) have been widely adopted in various applications;
however, they exhibit significant gender biases. Existing benchmarks primarily evaluate …

VLSBench: Unveiling Visual Leakage in Multimodal Safety

X Hu, D Liu, H Li, X Huang, J Shao - arXiv preprint arXiv:2411.19939, 2024 - arxiv.org
Safety concerns of Multimodal Large Language Models (MLLMs) have gradually become an
important problem in various applications. Surprisingly, previous works indicate a counter …

A Comprehensive Survey of Multimodal Large Language Models: Concept, Application and Safety

S Liu, W Pu, C Xu, Z Huang, Q Li, H Wang, C Lin… - 2024 - researchsquare.com
Recent advancements in MLLMs, exemplified by developments like GPT-4o, have positioned
them as a significant focus within the research community. MLLMs leverage …

SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model

Y Zhang, L Chen, G Zheng, Y Gao, R Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
The emergence of Vision Language Models (VLMs) has brought unprecedented advances
in understanding multimodal information. The combination of textual and visual semantics in …

Multi-P²A: A Multi-perspective Benchmark on Privacy Assessment for Large Vision-Language Models

J Zhang, X Cao, Z Han, S Shan, X Chen - arXiv preprint arXiv:2412.19496, 2024 - arxiv.org
Large Vision-Language Models (LVLMs) exhibit impressive potential across various tasks
but also face significant privacy risks, limiting their practical applications. Current research …

MultiSkill: Evaluating Large Multimodal Models for Fine-grained Alignment Skills

Z Xu, S Shi, B Hu, L Wang, M Zhang - Findings of the Association …, 2024 - aclanthology.org
We propose MultiSkill, an evaluation protocol that assesses large multimodal models
(LMMs) across multiple fine-grained skills for alignment with human values. Recent LMMs …

VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model

J Zhang, S Wang, X Cao, Z Yuan, S Shan… - arXiv preprint arXiv …, 2024 - arxiv.org
The emergence of Large Vision-Language Models (LVLMs) marks significant strides
towards achieving general artificial intelligence. However, these advancements are …

The VLLM Safety Paradox: Dual Ease in Jailbreak Attack and Defense

Y Guo, F Jiao, L Nie, M Kankanhalli - arXiv preprint arXiv:2411.08410, 2024 - arxiv.org
The vulnerability of Vision Large Language Models (VLLMs) to jailbreak attacks comes as
no surprise. However, recent defense mechanisms against these attacks have reached near …