Z Yuan, Z Li, W Huang, Y Ye, L Sun - 2nd Workshop on Advancing Neural … - openreview.net
In recent years, multimodal large language models (MLLMs) such as GPT-4V have
advanced remarkably, excelling in a variety of vision-language tasks …