Y Liu, Z Zhao, Z Zhuang, L Tian, X Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, vision-language models have made significant strides, excelling in tasks like
optical character recognition and geometric problem-solving. However, several critical …