Learning to agree on vision attention for visual commonsense reasoning

Z Li, Y Guo, K Wang, F Liu, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Visual Commonsense Reasoning (VCR) remains a significant yet challenging research
problem in the realm of visual reasoning. A VCR model generally aims at answering a …

Joint answering and explanation for visual commonsense reasoning

Z Li, Y Guo, K Wang, Y Wei, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Visual Commonsense Reasoning (VCR), deemed as one challenging extension of Visual
Question Answering (VQA), endeavors to pursue a higher-level visual comprehension. VCR …

Multi-level counterfactual contrast for visual commonsense reasoning

X Zhang, F Zhang, C Xu - Proceedings of the 29th ACM International …, 2021 - dl.acm.org
Given a question about an image, a Visual Commonsense Reasoning (VCR) model needs
to provide not only a correct answer, but also a rationale to justify the answer. It is a …

Cognitive visual commonsense reasoning using dynamic working memory

X Tang, X Huang, W Zhang, TB Child, Q Hu… - Big Data Analytics and …, 2021 - Springer
Visual Commonsense Reasoning (VCR) predicts an answer with corresponding
rationale, given a question-image input. VCR is a recently introduced visual scene …

Explicit cross-modal representation learning for visual commonsense reasoning

X Zhang, F Zhang, C Xu - IEEE Transactions on Multimedia, 2021 - ieeexplore.ieee.org
Given a question about an image, Visual Commonsense Reasoning (VCR) needs to provide
not only a correct answer, but also a rationale to justify the answer. VCR is a challenging …

Multi-level knowledge injecting for visual commonsense reasoning

Z Wen, Y Peng - IEEE Transactions on Circuits and Systems for …, 2020 - ieeexplore.ieee.org
When glancing at an image, humans can infer what is hidden in the image beyond what is
visually obvious, such as objects' functions, people's intents and mental states. However …

Transformation driven visual reasoning

X Hong, Y Lan, L Pang, J Guo… - Proceedings of the …, 2021 - openaccess.thecvf.com
This paper defines a new visual reasoning paradigm by introducing an important factor, i.e.,
transformation. The motivation comes from the fact that most existing visual reasoning tasks …

SGEITL: Scene graph enhanced image-text learning for visual commonsense reasoning

Z Wang, H You, LH Li, A Zareian, S Park… - Proceedings of the …, 2022 - ojs.aaai.org
Answering complex questions about images is an ambitious goal for machine intelligence,
which requires a joint understanding of images, text, and commonsense knowledge, as well …

Efficient and self-adaptive rationale knowledge base for visual commonsense reasoning

Z Song, Z Hu, R Hong - Multimedia Systems, 2023 - Springer
Visual commonsense reasoning (VCR) task leads to a cognitive level of understanding
between vision and linguistic domains. Three sub-tasks, i.e., Q → A, QA → R …

Multi-modal structure-embedding graph transformer for visual commonsense reasoning

J Zhu, H Wang, B He - IEEE Transactions on Multimedia, 2023 - ieeexplore.ieee.org
Visual commonsense reasoning (VCR) is a challenging reasoning task that aims to not only
answer the question based on a given image but also provide a rationale justifying for the …