SGTR: End-to-end scene graph generation with transformer

P Xu, X Zhu, DA Clifton - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org

Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …

Cited by 348 Related articles All 9 versions

[PDF] arxiv.org

RelTR: Relation transformer for scene graph generation

Y Cong, MY Yang, B Rosenhahn - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Different objects in the same scene are more or less related to each other, but only a limited
number of these relationships are noteworthy. Inspired by Detection Transformer, which …

Cited by 101 Related articles All 10 versions

[PDF] thecvf.com

Prototype-based embedding network for scene graph generation

C Zheng, X Lyu, L Gao, B Dai… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Abstract Current Scene Graph Generation (SGG) methods explore contextual information to
predict relationships among entity pairs. However, due to the diverse visual appearance of …

Cited by 32 Related articles All 8 versions

[HTML] sciencedirect.com

[HTML] Scene graph generation: A comprehensive survey

H Li, G Zhu, L Zhang, Y Jiang, Y Dang, H Hou, P Shen… - Neurocomputing, 2024 - Elsevier

Deep learning techniques have led to remarkable breakthroughs in the field of object
detection and have spawned a lot of scene-understanding tasks in recent years. Scene …

Cited by 6 Related articles All 4 versions

[PDF] thecvf.com

Panoptic video scene graph generation

J Yang, W Peng, X Li, Z Guo, L Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Towards building comprehensive real-world visual perception systems, we propose and
study a new problem called panoptic scene graph generation (PVSG). PVSG is related to …

Cited by 25 Related articles All 5 versions

[PDF] thecvf.com

RLIPv2: Fast scaling of relational language-image pre-training

H Yuan, S Zhang, X Wang, S Albanie… - Proceedings of the …, 2023 - openaccess.thecvf.com

Abstract Relational Language-Image Pre-training (RLIP) aims to align vision representations
with relational texts, thereby advancing the capability of relational reasoning in computer …

Cited by 13 Related articles All 6 versions

[PDF] thecvf.com

Learning to generate language-supervised and open-vocabulary scene graph using pre-trained visual-semantic space

Y Zhang, Y Pan, T Yao, R Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Scene graph generation (SGG) aims to abstract an image into a graph structure, by
representing objects as graph nodes and their relations as labeled edges. However, two …

Cited by 14 Related articles All 4 versions

[PDF] arxiv.org

Relationformer: A Unified Framework for Image-to-Graph Generation

S Shit, R Koner, B Wittmann, J Paetzold, I Ezhov… - … on Computer Vision, 2022 - Springer

A comprehensive representation of an image requires understanding objects and their
mutual relationship, especially in image-to-graph generation, e.g., road network extraction …

Cited by 39 Related articles All 12 versions

[PDF] thecvf.com

Unbiased scene graph generation in videos

S Nag, K Min, S Tripathi… - Proceedings of the …, 2023 - openaccess.thecvf.com

The task of dynamic scene graph generation (SGG) from videos is complicated and
challenging due to the inherent dynamics of a scene, temporal fluctuation of model …

Cited by 14 Related articles All 6 versions

[PDF] neurips.cc

4D panoptic scene graph generation

J Yang, J Cen, W Peng, S Liu, F Hong… - Advances in …, 2024 - proceedings.neurips.cc

We are living in a three-dimensional space while moving forward through a fourth
dimension: time. To allow artificial intelligence to develop a comprehensive understanding …

Cited by 3 Related articles All 5 versions
