FashionBERT: Text and image matching with adaptive loss for cross-modal retrieval

D Gao, L Jin, B Chen, M Qiu, P Li, Y Wei, Y Hu… - Proceedings of the 43rd …, 2020 - dl.acm.org
In this paper, we address the text and image matching in cross-modal retrieval of the fashion
industry. Different from the matching in the general domain, the fashion matching is required …

Deep relation embedding for cross-modal retrieval

Y Zhang, W Zhou, M Wang, Q Tian… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Cross-modal retrieval aims to identify relevant data across different modalities. In this work,
we are dedicated to cross-modal retrieval between images and text sentences, which is …

Fine-grained visual textual alignment for cross-modal retrieval using transformer encoders

N Messina, G Amato, A Esuli, F Falchi… - ACM Transactions on …, 2021 - dl.acm.org
Despite the evolution of deep-learning-based visual-textual processing systems, precise
multi-modal matching remains a challenging task. In this work, we tackle the task of cross …

Image-text matching with fine-grained relational dependency and bidirectional attention-based generative networks

J Zhu, Z Li, Y Zeng, J Wei, H Ma - Proceedings of the 30th ACM …, 2022 - dl.acm.org
Generally, most existing cross-modal retrieval methods only consider global or local
semantic embeddings, lacking fine-grained dependencies between objects. At the same …

Cross-modal image-text retrieval with semantic consistency

H Chen, G Ding, Z Lin, S Zhao, J Han - Proceedings of the 27th ACM …, 2019 - dl.acm.org
Cross-modal image-text retrieval has been a long-standing challenge in the multimedia
community. Existing methods explore various complicated embedding spaces to assess the …

Preserving semantic neighborhoods for robust cross-modal retrieval

C Thomas, A Kovashka - Computer Vision–ECCV 2020: 16th European …, 2020 - Springer
The abundance of multimodal data (e.g., social media posts) has inspired interest in
cross-modal retrieval methods. Popular approaches rely on a variety of metric learning losses …

IMRAM: Iterative matching with recurrent attention memory for cross-modal image-text retrieval

H Chen, G Ding, X Liu, Z Lin, J Liu… - Proceedings of the …, 2020 - openaccess.thecvf.com
Enabling bi-directional retrieval of images and texts is important for understanding the
correspondence between vision and language. Existing methods leverage the attention …

Heterogeneous attention network for effective and efficient cross-modal retrieval

T Yu, Y Yang, Y Li, L Liu, H Fei, P Li - Proceedings of the 44th …, 2021 - dl.acm.org
Traditionally, the task of cross-modal retrieval is tackled through joint embedding. However,
the global matching used in joint embedding methods often fails to effectively describe …

Joint attribute manipulation and modality alignment learning for composing text and image to image retrieval

F Zhang, M Xu, Q Mao, C Xu - Proceedings of the 28th ACM international …, 2020 - dl.acm.org
Cross-modal retrieval has attracted much attention in recent years due to its wide
applications. Conventional approaches usually take one modality as query to retrieve …

Modal-adversarial semantic learning network for extendable cross-modal retrieval

X Xu, J Song, H Lu, Y Yang, F Shen… - Proceedings of the 2018 …, 2018 - dl.acm.org
Cross-modal retrieval, e.g., using an image query to search related text and vice-versa, has
become a highlighted research topic, to provide flexible retrieval experience across multi …