[PDF] from thecvf.com

Attention-Translation-Relation Network for Scalable Scene Graph Generation

Authors

Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Koutras, Petros Maragos

Publication date

2019

Conference

Proceedings of the IEEE International Conference on Computer Vision Workshops

Pages

0-0

Description

We find that most Scene Graph Generation approaches suffer from two limitations as they: 1) use generic attention mechanisms and dataset-specific statistics that supersede visual features and 2) treat "no interaction" as an extra, both noisy and dominant, class and prune graph edges manually or by applying simple filters. As a result, such approaches do not scale up on different settings and specifications. We propose a three-stage pipeline that employs Multi-Head Attention driven by language and spatial features, Translation Embeddings and Multi-Tasking to detect an interacting pair of objects. Our attentional scheme is able to maximize the visual features' interpretability, as well as to capture the nature of datasets of different scales, while multi-tasking robustly resolves the bias of the background class. We present an experimental overview of the related literature, unveil a multitude of evaluation inconsistencies and provide quantitative and qualitative support with experiments on a variety of datasets, where our approach performs on par with or even outperforms the current state-of-the-art.

Total citations

Cited by 33

Citations per year: 2020: 5, 2021: 3, 2022: 9, 2023: 12, 2024: 3

Scholar articles

Attention-translation-relation network for scalable scene graph generation

N Gkanatsios, V Pitsikalis, P Koutras, P Maragos - Proceedings of the IEEE/CVF international conference …, 2019

Cited by 33, all 4 versions