Z Shao, Z Yu, J Yu, X Ouyang, L Zheng, Z Gai… - arXiv preprint arXiv …, 2024 - arxiv.org
By harnessing the capabilities of large language models (LLMs), recent large multimodal
models (LMMs) have shown remarkable versatility in open-world multimodal understanding …