Y Tang, X Han, X Li, Q Yu, Y Hao, L Hu… - arXiv e …, 2024 - ui.adsabs.harvard.edu
Large 2D vision-language models (2D-LLMs) have gained significant attention by bridging
Large Language Models (LLMs) with images using a simple projector. Inspired by their …