S Huang, H Zhang, Y Gao, Y Hu, Z Qin - arXiv preprint arXiv:2404.11865, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) have demonstrated profound capabilities in
understanding multimodal information, ranging from Image LLMs to the more complex …