
[PDF] from thecvf.com

Learning human-object interactions by graph parsing neural networks

Authors

Siyuan Qi, Wenguan Wang, Baoxiong Jia, Jianbing Shen, Song-Chun Zhu

Publication date

2018

Conference

Proceedings of the European conference on computer vision (ECCV)

Pages

401-417

Description

This paper addresses the task of detecting and recognizing human-object interactions (HOI) in images and videos. We introduce the Graph Parsing Neural Network (GPNN), a framework that incorporates structural knowledge while being differentiable end-to-end. For a given scene, GPNN infers a parse graph that includes i) the HOI graph structure represented by an adjacency matrix, and ii) the node labels. Within a message passing inference framework, GPNN iteratively computes the adjacency matrices and node labels. We extensively evaluate our model on three HOI detection benchmarks on images and videos: HICO-DET, V-COCO, and CAD-120 datasets. Our approach significantly outperforms state-of-the-art methods, verifying that GPNN is scalable to large datasets and applies to spatial-temporal settings.
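
The alternation described in the abstract (estimate an adjacency matrix, pass messages over it, then read out node labels) can be illustrated with a toy sketch. This is not the authors' implementation: the link function (a sigmoid of a dot product), the tanh node update, the linear readout, and every name below (`infer_parse_graph`, `features`, `class_weights`, `steps`) are assumptions chosen for brevity, and nothing here is learned.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def infer_parse_graph(features, class_weights, steps=3):
    """Toy parse-graph inference (illustrative assumptions only).

    features: one feature vector per node (hypothetical inputs).
    class_weights: one weight vector per label (hypothetical readout).
    Returns (adjacency, labels).
    """
    h = [list(f) for f in features]
    n = len(h)
    adjacency = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        # Link step: soft adjacency from pairwise similarity, no self-loops.
        adjacency = [
            [sigmoid(dot(h[i], h[j])) if i != j else 0.0 for j in range(n)]
            for i in range(n)
        ]
        # Message step: each node sums neighbor states weighted by adjacency.
        messages = [
            [sum(adjacency[i][j] * h[j][k] for j in range(n))
             for k in range(len(h[i]))]
            for i in range(n)
        ]
        # Update step: combine the old state and the incoming message.
        h = [
            [math.tanh(a + m) for a, m in zip(h[i], messages[i])]
            for i in range(n)
        ]
    # Readout step: pick the highest-scoring label for every node.
    labels = [
        max(range(len(class_weights)), key=lambda c: dot(class_weights[c], h[i]))
        for i in range(n)
    ]
    return adjacency, labels


if __name__ == "__main__":
    adjacency, labels = infer_parse_graph(
        [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]],  # three made-up nodes
        [[1.0, 0.0], [-1.0, 0.0]],              # two made-up labels
    )
    for row in adjacency:
        print([round(a, 3) for a in row])
    print(labels)
```

In this sketch the adjacency matrix is recomputed from the current node states at every step, so the graph structure and the node states refine each other, which is the general idea the abstract describes.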

Total citations

Cited by 609

Citations per year: 2018: 2, 2019: 45, 2020: 112, 2021: 153, 2022: 126, 2023: 127, 2024: 44
