- Academic resource search

A comprehensive survey of scene graphs: Generation and application

X Chang, P Ren, P Xu, Z Li, X Chen… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Scene graph is a structured representation of a scene that can clearly express the objects,
attributes, and relationships between objects in the scene. As computer vision technology …

Cited by 284 | Related articles | All 15 versions

[PDF] sciencedirect.com

Multimodal research in vision and language: A review of current and emerging trends

S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier

Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …

Cited by 85 | Related articles | All 5 versions

[HTML] sciencedirect.com

[HTML] Cpt: Colorful prompt tuning for pre-trained vision-language models

Y Yao, A Zhang, Z Zhang, Z Liu, TS Chua, M Sun - AI Open, 2024 - Elsevier

Vision-Language Pre-training (VLP) models have shown promising capabilities in
grounding natural language in image data, facilitating a broad range of cross-modal tasks …

Cited by 210 | Related articles | All 4 versions

[PDF] neurips.cc

Causal intervention for weakly-supervised semantic segmentation

D Zhang, H Zhang, J Tang… - Advances in Neural …, 2020 - proceedings.neurips.cc

We present a causal inference framework to improve Weakly-Supervised Semantic
Segmentation (WSSS). Specifically, we aim to generate better pixel-level pseudo-masks by …

Cited by 420 | Related articles | All 11 versions

[PDF] arxiv.org

Panoptic scene graph generation

J Yang, YZ Ang, Z Guo, K Zhou, W Zhang… - European Conference on …, 2022 - Springer

Existing research addresses scene graph generation (SGG), a critical technology for scene
understanding in images, from a detection perspective, i.e., objects are detected using …

Cited by 78 | Related articles | All 5 versions

[PDF] arxiv.org

Multi-modal knowledge graph construction and application: A survey

X Zhu, Z Li, X Wang, X Jiang, P Sun… - … on Knowledge and …, 2022 - ieeexplore.ieee.org

Recent years have witnessed the resurgence of knowledge engineering which is featured
by the fast growth of knowledge graphs. However, most of existing knowledge graphs are …

Cited by 133 | Related articles | All 7 versions

[PDF] thecvf.com

Unbiased scene graph generation from biased training

K Tang, Y Niu, J Huang, J Shi… - Proceedings of the …, 2020 - openaccess.thecvf.com

Today's scene graph generation (SGG) task is still far from practical, mainly due to the
severe training bias, e.g., collapsing diverse "human walk on/sit on/lay on beach" into "human …

Cited by 699 | Related articles | All 10 versions

[PDF] thecvf.com

Bipartite graph network with adaptive message passing for unbiased scene graph generation

R Li, S Zhang, B Wan, X He - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

Scene graph generation is an important visual understanding task with a broad range of
vision applications. Despite recent tremendous progress, it remains challenging due to the …

Cited by 202 | Related articles | All 6 versions

[PDF] thecvf.com

Auto-encoding scene graphs for image captioning

X Yang, K Tang, H Zhang, J Cai - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

We propose Scene Graph Auto-Encoder (SGAE) that incorporates the language
inductive bias into the encoder-decoder image captioning framework for more human-like …

Cited by 807 | Related articles | All 11 versions

[PDF] thecvf.com

Mukea: Multimodal knowledge extraction and accumulation for knowledge-based visual question answering

Y Ding, J Yu, B Liu, Y Hu, M Cui… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Knowledge-based visual question answering requires the ability of associating
external knowledge for open-ended cross-modal scene understanding. One limitation of …

Cited by 87 | Related articles | All 7 versions
