Advancing referring expression segmentation beyond single image

Y Wu, Z Zhang, C Xie, F Zhu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Proceedings of the IEEE/CVF International Conference on …, 2023•openaccess.thecvf.com

Abstract

Referring Expression Segmentation (RES) is a widely explored multi-modal task, which endeavors to segment the pre-existing object within a single image with a given linguistic expression. However, in broader real-world scenarios, it is not always possible to determine if the described object exists in a specific image. Generally, a collection of images is available, some of which potentially contain the target objects. To this end, we propose a more realistic setting, named Group-wise Referring Expression Segmentation (GRES), which expands RES to a group of related images, allowing the described objects to exist in a subset of the input image group. To support this new setting, we introduce an elaborately compiled dataset named Grouped Referring Dataset (GRD), containing complete group-wise annotations of the target objects described by given expressions. Moreover, we also present a baseline method named Grouped Referring Segmenter (GRSer), which explicitly captures the language-vision and intra-group vision-vision interactions to achieve state-of-the-art results on the proposed GRES setting and related tasks, such as Co-Salient Object Detection and traditional RES. Our dataset and code are publicly released at https://github.com/shikras/d-cube.

Cited by 13 · Related articles · All 6 versions
