The topology and language of relationships in the visual genome dataset

DA Chacra, J Zelek - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
The Visual Genome Dataset is the de facto standard dataset used in Scene Graph
generation. It contains a large collection of images with corresponding object and …

Weakly supervised learning for textbook question answering

J Ma, Q Chai, J Huang, J Liu, Y You… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Textbook Question Answering (TQA) is the task of answering diagram and non-diagram
questions given large multi-modal contexts consisting of abundant text and diagrams. Deep …

Generating salient scene graphs with weak language supervision

A Benetatos, M Diomataris, V Pitsikalis… - 2023 31st European …, 2023 - ieeexplore.ieee.org
Scene Graph Generation (SGG), given an image, is the task of building directed graphs
where edges represent predicted triplets. Most SGG models struggle to identify important …

VCD: Knowledge Base Guided Visual Commonsense Discovery in Images

X Shen, Y Song, S Wu, R Xia - arXiv preprint arXiv:2402.17213, 2024 - arxiv.org
Visual commonsense contains knowledge about object properties, relationships, and
behaviors in visual data. Discovering visual commonsense can provide a more …

Boosting relationship detection in images with multi-granular self-supervised learning

X Ding, Y Pan, Y Li, T Yao, D Zeng, T Mei - ACM Transactions on …, 2023 - dl.acm.org
Visual and spatial relationship detection in images has been a fast-developing research
topic in the multimedia field, which learns to recognize the semantic/spatial interactions …

Common-Sense Bias Discovery and Mitigation for Classification Tasks

M Zhang, B Colman, A Shahriyari, G Bharaj - arXiv preprint arXiv …, 2024 - arxiv.org
Machine learning model bias can arise from dataset composition: sensitive features
correlated to the learning target disturb the model decision rule and lead to performance …

RelVAE: Generative Pretraining for few-shot Visual Relationship Detection

S Karapiperis, M Diomataris, V Pitsikalis - arXiv preprint arXiv:2311.16261, 2023 - arxiv.org
Visual relations are complex, multimodal concepts that play an important role in the way
humans perceive the world. As a result of their complexity, high-quality, diverse and large …

Real-time Semantic Healthcare System: Visual Risks Identification for Elders and Children

M Belkebir, TM Maarouk, B Nini - Informatica, 2024 - informatica.si
Deep learning and data-driven approaches are commonly used to avoid accidents involving
elders and children. However, existing models are limited by a semantic gap, hindering their …

Naive Scene Graphs: How Visual is Modern Visual Relationship Detection?

D Abou Chacra, J Zelek - 2023 20th Conference on Robots and …, 2023 - ieeexplore.ieee.org
Modern approaches to scene graph generation still struggle with their performance, with
even state-of-the-art approaches hovering under a 15% mean recall on certain evaluation …

Modern Object and Visual Relationship Detection in Images from a Critical, Cognitive and Data Perspective

D Abou Chacra - 2023 - uwspace.uwaterloo.ca
Deep learning has dominated the landscape of computer vision for the past decade. Deep
learning networks are the top performers on a slew of computer vision challenges (e.g., object …