LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models

G Ben Melech Stan, E Aflalo… - Proceedings of the …, 2024 - openaccess.thecvf.com
In the rapidly evolving landscape of artificial intelligence, multi-modal large language models
are emerging as a significant area of interest. These models, which combine various forms of …

LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models

GBM Stan, RY Rohekar, Y Gurwicz, ML Olson… - arXiv preprint arXiv …, 2024 - arxiv.org
In the rapidly evolving landscape of artificial intelligence, multi-modal large language
models are emerging as a significant area of interest. These models, which combine various …

Quantifying and Enabling the Interpretability of CLIP-like Models

A Madasu, Y Gandelsman, V Lal, P Howard - arXiv preprint arXiv …, 2024 - arxiv.org
CLIP is one of the most popular foundational models and is heavily used for many vision-
language tasks. However, little is known about the inner workings of CLIP. To bridge this …