
Learning to agree on vision attention for visual commonsense reasoning

Z Li, Y Guo, K Wang, F Liu, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Visual Commonsense Reasoning (VCR) remains a significant yet challenging research
problem in the realm of visual reasoning. A VCR model generally aims at answering a …

Cited by: 7 - Related articles - All 3 versions

[PDF] arxiv.org

Joint answering and explanation for visual commonsense reasoning

Z Li, Y Guo, K Wang, Y Wei, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Visual Commonsense Reasoning (VCR), deemed as one challenging extension of Visual
Question Answering (VQA), endeavors to pursue a higher-level visual comprehension. VCR …

Cited by: 13 - Related articles - All 7 versions

Multi-level counterfactual contrast for visual commonsense reasoning

X Zhang, F Zhang, C Xu - Proceedings of the 29th ACM International …, 2021 - dl.acm.org

Given a question about an image, a Visual Commonsense Reasoning (VCR) model needs
to provide not only a correct answer, but also a rationale to justify the answer. It is a …

Cited by: 22 - Related articles

[PDF] arxiv.org

Cognitive visual commonsense reasoning using dynamic working memory

X Tang, X Huang, W Zhang, TB Child, Q Hu… - Big Data Analytics and …, 2021 - Springer

Visual Commonsense Reasoning (VCR) predicts an answer with corresponding
rationale, given a question-image input. VCR is a recently introduced visual scene …

Cited by: 10 - Related articles - All 7 versions

Explicit cross-modal representation learning for visual commonsense reasoning

X Zhang, F Zhang, C Xu - IEEE Transactions on Multimedia, 2021 - ieeexplore.ieee.org

Given a question about an image, Visual Commonsense Reasoning (VCR) needs to provide
not only a correct answer, but also a rationale to justify the answer. VCR is a challenging …

Cited by: 26 - Related articles - All 2 versions

Multi-level knowledge injecting for visual commonsense reasoning

Z Wen, Y Peng - IEEE Transactions on Circuits and Systems for …, 2020 - ieeexplore.ieee.org

When glancing at an image, human can infer what is hidden in the image beyond what is
visually obvious, such as objects' functions, people's intents and mental states. However …

Cited by: 30 - Related articles

[PDF] thecvf.com

Transformation driven visual reasoning

X Hong, Y Lan, L Pang, J Guo… - Proceedings of the …, 2021 - openaccess.thecvf.com

This paper defines a new visual reasoning paradigm by introducing an important factor, i.e.,
transformation. The motivation comes from the fact that most existing visual reasoning tasks …

Cited by: 17 - Related articles - All 6 versions

[PDF] aaai.org

SGEITL: Scene graph enhanced image-text learning for visual commonsense reasoning

Z Wang, H You, LH Li, A Zareian, S Park… - Proceedings of the …, 2022 - ojs.aaai.org

Answering complex questions about images is an ambitious goal for machine intelligence,
which requires a joint understanding of images, text, and commonsense knowledge, as well …

Cited by: 20 - Related articles - All 7 versions

Efficient and self-adaptive rationale knowledge base for visual commonsense reasoning

Z Song, Z Hu, R Hong - Multimedia Systems, 2023 - Springer

Visual commonsense reasoning (VCR) task leads to a cognitive level of understanding
between vision and linguistic domains. Three sub-tasks, i.e., Q → A, QA → R …

Cited by: 6 - Related articles - All 3 versions

Multi-modal structure-embedding graph transformer for visual commonsense reasoning

J Zhu, H Wang, B He - IEEE Transactions on Multimedia, 2023 - ieeexplore.ieee.org

Visual commonsense reasoning (VCR) is a challenging reasoning task that aims to not only
answer the question based on a given image but also provide a rationale justifying for the …

Cited by: 5 - Related articles - All 2 versions
