Y Yang, X Zhang, J Xu, W Han - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
Vision-language models (VLMs) have shown excellent performance in vision-language tasks.
However, they sometimes lack sufficient reasoning ability. In contrast, large language …