Referring Expression Comprehension (REC) aims to locate a specific object within an image by interpreting a referring expression articulated in natural language. This task comprises …
The task of multimodal referring expression comprehension (REC), aiming at localizing an image region described by a natural language expression, has recently received increasing …
Q Li, Y Zhang, S Sun, J Wu, X Zhao, M Tan - Neurocomputing, 2022 - Elsevier
Referring expression comprehension and segmentation aim to locate and segment a referred instance in an image according to a natural language expression. However …
Y Wang, Z Ji, D Wang, Y Pang, X Li - Knowledge-Based Systems, 2024 - Elsevier
Referring Expression Comprehension (REC) is a task that involves grounding a specific object in an image based on a given referring query in the form of bounding boxes …
Referring Expression Comprehension (REC) aims to locate the target object in the image according to a referring expression. This is a challenging task owing to the need for …
L Li, Y Bu, Y Cai - Proceedings of the 29th ACM International …, 2021 - dl.acm.org
In this paper, we propose a one-stage approach to improve referring expression comprehension (REC) which aims at grounding the referent according to a natural language …
M Lu, R Li, F Feng, Z Ma, X Wang - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Referring Expression Comprehension (REC) is a fundamental task in the vision and language domain, which aims to locate an image region according to a natural language …
Y Qiao, C Deng, Q Wu - IEEE Transactions on Multimedia, 2020 - ieeexplore.ieee.org
Referring expression comprehension (REC) aims to localize a target object in an image described by a referring expression phrased in natural language. Different from the object …
J Ye, J Tian, M Yan, H Xu, Q Ye, Y Shi, X Yang… - ACM Transactions on … - dl.acm.org
Referring expression comprehension aims to align natural language queries with visual scenes, which requires establishing fine-grained correspondence between vision and …