J Ye, J Tian, M Yan, X Yang, X Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Visual grounding focuses on establishing fine-grained alignment between vision and natural
language, which has essential applications in multimodal reasoning systems. Existing …