Y Zhang, J Pan, Y Zhou, R Pan… - Proceedings of the 2023 …, 2023 - aclanthology.org
Abstract: Vision-Language Models (VLMs) are trained on vast amounts of data captured by
humans emulating our understanding of the world. However, known as visual illusions …