Graph reasoning transformer for image parsing

Authors

Dong Zhang, Jinhui Tang, Kwang-Ting Cheng

Publication date

2022/10/10

Book

Proceedings of the 30th ACM International Conference on Multimedia

Pages

2380-2389

Description

Capturing the long-range dependencies has empirically proven to be effective on a wide range of computer vision tasks. The progressive advances on this topic have been made through the employment of the transformer framework with the help of the multi-head attention mechanism. However, the attention-based image patch interaction potentially suffers from problems of redundant interactions of intra-class patches and unoriented interactions of inter-class patches. In this paper, we propose a novel Graph Reasoning Transformer (GReaT) for image parsing to enable image patches to interact following a relation reasoning pattern. Specifically, the linearly embedded image patches are first projected into the graph space, where each node represents the implicit visual center for a cluster of image patches and each edge reflects the relation weight between two adjacent nodes. After that, global relation reasoning is …
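The projection into graph space described above can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product softmax assignment, the similarity-based edge weights, and the single averaging step are all assumptions chosen for illustration, and every name here (`project_to_graph`, `relation_weights`, `reason_once`, `centers`) is hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_to_graph(patches, centers):
    """Soft-assign each patch to every node, then pool patches into node features.

    Assumption: assignment weights come from a softmax over dot products
    between a patch embedding and each (hypothetical) learnable center.
    """
    # assign[i][k] is the weight of patch i on node k; each row sums to 1.
    assign = [softmax([dot(p, c) for c in centers]) for p in patches]
    dim = len(patches[0])
    nodes = []
    for k in range(len(centers)):
        total = sum(a[k] for a in assign)  # always > 0 for softmax weights
        nodes.append([
            sum(a[k] * p[d] for a, p in zip(assign, patches)) / total
            for d in range(dim)
        ])
    return assign, nodes


def relation_weights(nodes):
    """Assumed edge weights: row-normalized similarity between node features."""
    return [softmax([dot(u, v) for v in nodes]) for u in nodes]


def reason_once(nodes, edges):
    """One illustrative reasoning step: each node becomes a weighted mix of all nodes."""
    dim = len(nodes[0])
    return [
        [sum(w * v[d] for w, v in zip(row, nodes)) for d in range(dim)]
        for row in edges
    ]


random.seed(0)
patches = [[random.gauss(0, 1) for _ in range(4)] for _ in range(16)]  # 16 patch embeddings, dim 4
centers = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]   # 3 hypothetical node centers

assign, nodes = project_to_graph(patches, centers)
edges = relation_weights(nodes)
updated = reason_once(nodes, edges)
```

Here 16 patch embeddings are pooled into 3 node features, so the reasoning step operates on a 3×3 relation matrix rather than on all 16×16 patch pairs.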

Total citations

Cited by 14

2022: 1, 2023: 10, 2024: 3
