Z Wang, Z Gao, K Guo, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Image-text retrieval is a fundamental task to bridge vision and language by exploiting
various strategies to fine-grained alignment between regions and words. This is still tough …