Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

S Liu, Y Ma, X Zhang, H Wang, J Ji… - Proceedings of the …, 2024 - openaccess.thecvf.com
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that
combines computer vision and natural language processing. Traditional Referring Image …

Improving Panoptic Narrative Grounding by Harnessing Semantic Relationships and Visual Confirmation

T Guo, H Wang, Y Ma, J Ji, X Sun - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Recent advancements in single-stage Panoptic Narrative Grounding (PNG) have
demonstrated significant potential. These methods predict pixel-level masks by directly …

X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks

Z Qian, Y Ma, J Ji, X Sun - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Referring 3D instance segmentation is a challenging task aimed at accurately segmenting a
target instance within a 3D scene based on a given referring expression. However, previous …

SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation

D Yang, J Ji, Y Ma, T Guo, H Wang, X Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we introduce SemiRES, a semi-supervised framework that effectively
leverages a combination of labeled and unlabeled data to perform RES. A significant hurdle …

Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model

D Yang, R Dong, J Ji, Y Ma, H Wang, X Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, diffusion models have increasingly demonstrated their capabilities in vision
understanding. By leveraging prompt-based learning to construct sentences, these models …

Image Captioning via Dynamic Path Customization

Y Ma, J Ji, X Sun, Y Zhou, X Hong, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper explores a novel dynamic network for vision and language tasks, where the
inferring structure is customized on the fly for different inputs. Most previous state-of-the-art …

Multi-branch Collaborative Learning Network for 3D Visual Grounding

Z Qian, Y Ma, Z Lin, J Ji, X Zheng, X Sun… - arXiv preprint arXiv …, 2024 - Springer
3D referring expression comprehension (3DREC) and segmentation (3DRES) have
overlapping objectives, indicating their potential for collaboration. However, existing …

Hierarchical Activation Dual Backbone Network for Weakly Supervised Semantic Segmentation

C Zhang, L Zhang - IEEE Sensors Journal, 2024 - ieeexplore.ieee.org
Weakly Supervised Semantic Segmentation (WSSS) aims to achieve segmentation using
weak labels, thereby reducing annotation costs. Current mainstream WSSS methods utilize …