Y Gong, D Ran, J Liu, C Wang, T Cong, A Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large vision-language models (VLMs) like GPT-4V represent an unprecedented revolution
in the field of artificial intelligence (AI). Compared to single-modal large language models …