Semi-supervised panoptic narrative grounding

S Liu, Y Ma, X Zhang, H Wang, J Ji… - Proceedings of the …, 2024 - openaccess.thecvf.com

Abstract Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that
combines computer vision and natural language processing. Traditional Referring Image …

Cited by 10 · Related articles · All 3 versions

Improving Panoptic Narrative Grounding by Harnessing Semantic Relationships and Visual Confirmation

T Guo, H Wang, Y Ma, J Ji, X Sun - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

Recent advancements in single-stage Panoptic Narrative Grounding (PNG) have
demonstrated significant potential. These methods predict pixel-level masks by directly …

Cited by 3 · Related articles

X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks

Z Qian, Y Ma, J Ji, X Sun - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

Referring 3D instance segmentation is a challenging task aimed at accurately segmenting a
target instance within a 3D scene based on a given referring expression. However, previous …

Cited by 7 · Related articles

SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation

D Yang, J Ji, Y Ma, T Guo, H Wang, X Sun… - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper, we introduce SemiRES, a semi-supervised framework that effectively
leverages a combination of labeled and unlabeled data to perform RES. A significant hurdle …

Cited by 4 · Related articles · All 3 versions

Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model

D Yang, R Dong, J Ji, Y Ma, H Wang, X Sun… - arXiv preprint arXiv …, 2024 - arxiv.org

Recently, diffusion models have increasingly demonstrated their capabilities in vision
understanding. By leveraging prompt-based learning to construct sentences, these models …

Image Captioning via Dynamic Path Customization

Y Ma, J Ji, X Sun, Y Zhou, X Hong, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org

This paper explores a novel dynamic network for vision and language tasks, where the
inferring structure is customized on the fly for different inputs. Most previous state-of-the-art …

Cited by 1 · Related articles · All 2 versions

Multi-branch Collaborative Learning Network for 3D Visual Grounding

Z Qian, Y Ma, Z Lin, J Ji, X Zheng, X Sun… - arXiv preprint arXiv …, 2024 - Springer

3D referring expression comprehension (3DREC) and segmentation (3DRES) have
overlapping objectives, indicating their potential for collaboration. However, existing …

Hierarchical Activation Dual Backbone Network for Weakly Supervised Semantic Segmentation

C Zhang, L Zhang - IEEE Sensors Journal, 2024 - ieeexplore.ieee.org

Weakly Supervised Semantic Segmentation (WSSS) aims to achieve segmentation using
weak labels, thereby reducing annotation costs. Current mainstream WSSS methods utilize …
