Zero-shot visual reasoning by vision-language models: Benchmarking and analysis

A Nagar, S Jaiswal, C Tan - 2024 International Joint …, 2024 - ieeexplore.ieee.org
Vision-language models (VLMs) have shown impressive zero- and few-shot performance on
real-world visual question answering (VQA) benchmarks, alluding to their capabilities as …