Grounding consistency: Distilling spatial common sense for precise visual relationship detection

DA Chacra, J Zelek - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com

Abstract: The Visual Genome Dataset is the de facto standard dataset used in Scene Graph
generation. It contains a large collection of images with corresponding object and …

Cited by 11 · Related articles · All 4 versions

[PDF] techrxiv.org

Weakly supervised learning for textbook question answering

J Ma, Q Chai, J Huang, J Liu, Y You… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Textbook Question Answering (TQA) is the task of answering diagram and non-diagram
questions given large multi-modal contexts consisting of abundant text and diagrams. Deep …

Cited by 15 · Related articles · All 7 versions

[PDF] eurasip.org

Generating salient scene graphs with weak language supervision

A Benetatos, M Diomataris, V Pitsikalis… - 2023 31st European …, 2023 - ieeexplore.ieee.org

Scene Graph Generation (SGG), given an image, is the task of building directed graphs
where edges represent predicted triplets. Most SGG models struggle to identify important …

Cited by 2 · Related articles · All 2 versions

[PDF] arxiv.org

VCD: Knowledge Base Guided Visual Commonsense Discovery in Images

X Shen, Y Song, S Wu, R Xia - arXiv preprint arXiv:2402.17213, 2024 - arxiv.org

Visual commonsense contains knowledge about object properties, relationships, and
behaviors in visual data. Discovering visual commonsense can provide a more …

Cited by 3 · Related articles · All 2 versions

Boosting relationship detection in images with multi-granular self-supervised learning

X Ding, Y Pan, Y Li, T Yao, D Zeng, T Mei - ACM Transactions on …, 2023 - dl.acm.org

Visual and spatial relationship detection in images has been a fast-developing research
topic in the multimedia field, which learns to recognize the semantic/spatial interactions …

Cited by 2 · Related articles

[PDF] arxiv.org

Common-Sense Bias Discovery and Mitigation for Classification Tasks

M Zhang, B Colman, A Shahriyari, G Bharaj - arXiv preprint arXiv …, 2024 - arxiv.org

Machine learning model bias can arise from dataset composition: sensitive features
correlated to the learning target disturb the model decision rule and lead to performance …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org
