Pre-training methods in information retrieval

Y Fan, X Xie, Y Cai, J Chen, X Ma, X Li… - … and Trends® in …, 2022 - nowpublishers.com
The core of information retrieval (IR) is to identify relevant information from large-scale
resources and return it as a ranked list to respond to a user's information need. In recent years …

Cross-lingual and multilingual CLIP

F Carlsson, P Eisen, F Rekathati… - Proceedings of the …, 2022 - aclanthology.org
The long-standing endeavor of relating the textual and the visual domain recently underwent
a pivotal breakthrough, as OpenAI released CLIP. This model distinguishes how well an …

Vision-and-language pretrained models: A survey

S Long, F Cao, SC Han, H Yang - arXiv preprint arXiv:2204.07356, 2022 - arxiv.org
Pretrained models have produced great success in both Computer Vision (CV) and Natural
Language Processing (NLP). This progress leads to learning joint representations of vision …

What is a multi-modal knowledge graph: a survey

J Peng, X Hu, W Huang, J Yang - Big Data Research, 2023 - Elsevier
With the explosive growth of multi-modal information on the Internet, the multi-modal
knowledge graph (MMKG) has become an important research topic in knowledge graphs to …

Heterogeneous attention network for effective and efficient cross-modal retrieval

T Yu, Y Yang, Y Li, L Liu, H Fei, P Li - Proceedings of the 44th …, 2021 - dl.acm.org
Traditionally, the task of cross-modal retrieval is tackled through joint embedding. However,
the global matching used in joint embedding methods often fails to effectively describe …

Cross-lingual cross-modal retrieval with noise-robust learning

Y Wang, J Dong, T Liang, M Zhang, R Cai… - Proceedings of the 30th …, 2022 - dl.acm.org
Despite the recent developments in the field of cross-modal retrieval, there has been less
research focusing on low-resource languages due to the lack of manually annotated …

Cross-lingual cross-modal consolidation for effective multilingual video corpus moment retrieval

J Liu, T Yu, H Peng, M Sun, P Li - Findings of the Association for …, 2022 - aclanthology.org
Existing multilingual video corpus moment retrieval (mVCMR) methods are mainly based on
a two-stream structure. The visual stream utilizes the visual content in the video to estimate …

Dual-view curricular optimal transport for cross-lingual cross-modal retrieval

Y Wang, S Wang, H Luo, J Dong… - … on Image Processing, 2024 - ieeexplore.ieee.org
Current research on cross-modal retrieval is mostly English-oriented, owing to the availability of a
large number of English-oriented human-labeled vision-language corpora. In order to break …

AGREE: Aligning cross-modal entities for image-text retrieval upon vision-language pre-trained models

X Wang, L Li, Z Li, X Wang, X Zhu, C Wang… - Proceedings of the …, 2023 - dl.acm.org
Image-text retrieval is a challenging cross-modal task that arouses much attention. While the
traditional methods cannot break down the barriers between different modalities, Vision …

Inflate and shrink: Enriching and reducing interactions for fast text-image retrieval

H Liu, T Yu, P Li - Proceedings of the 2021 Conference on …, 2021 - aclanthology.org
By exploiting the cross-modal attention, cross-BERT methods have achieved state-of-the-art
accuracy in cross-modal retrieval. Nevertheless, the heavy text-image interactions in the …