Z Lv, W Wang, J Wang, S Zhang, F Wu - arXiv preprint arXiv:2501.05662, 2025 - arxiv.org
Efficient Multimodal Large Language Models (EMLLMs) have advanced rapidly in recent years.
Incorporating Chain-of-Thought (CoT) reasoning and step-by-step self-evaluation has …