AC Le-Ngo, T Tran, S Rana, S Gupta… - arXiv preprint arXiv …, 2020 - arxiv.org
Given an image, background knowledge, and a set of questions about an object, human learners answer the questions very consistently regardless of question forms and semantic …