Cross-lingual cross-modal pretraining for multimodal retrieval

Y Fan, X Xie, Y Cai, J Chen, X Ma, X Li… - … and Trends® in …, 2022 - nowpublishers.com

The core of information retrieval (IR) is to identify relevant information from large-scale
resources and return it as a ranked list to respond to user's information need. In recent years …

Cited by 83 · Related articles · All 8 versions

Cross-lingual and multilingual CLIP

F Carlsson, P Eisen, F Rekathati… - Proceedings of the …, 2022 - aclanthology.org

The long-standing endeavor of relating the textual and the visual domain recently underwent
a pivotal breakthrough, as OpenAI released CLIP. This model distinguishes how well an …

Cited by 68 · Related articles · All 3 versions

Vision-and-language pretrained models: A survey

S Long, F Cao, SC Han, H Yang - arXiv preprint arXiv:2204.07356, 2022 - arxiv.org

Pretrained models have produced great success in both Computer Vision (CV) and Natural
Language Processing (NLP). This progress leads to learning joint representations of vision …

Cited by 54 · Related articles · All 8 versions

What is a multi-modal knowledge graph: a survey

J Peng, X Hu, W Huang, J Yang - Big Data Research, 2023 - Elsevier

With the explosive growth of multi-modal information on the Internet, the multi-modal
knowledge graph (MMKG) has become an important research topic in knowledge graphs to …

Cited by 12 · Related articles · All 3 versions

Heterogeneous attention network for effective and efficient cross-modal retrieval

T Yu, Y Yang, Y Li, L Liu, H Fei, P Li - Proceedings of the 44th …, 2021 - dl.acm.org

Traditionally, the task of cross-modal retrieval is tackled through joint embedding. However,
the global matching used in joint embedding methods often fails to effectively describe …

Cited by 51 · Related articles · All 2 versions

Cross-lingual cross-modal retrieval with noise-robust learning

Y Wang, J Dong, T Liang, M Zhang, R Cai… - Proceedings of the 30th …, 2022 - dl.acm.org

Despite the recent developments in the field of cross-modal retrieval, there has been less
research focusing on low-resource languages due to the lack of manually annotated …

Cited by 18 · Related articles · All 3 versions

Cross-lingual cross-modal consolidation for effective multilingual video corpus moment retrieval

J Liu, T Yu, H Peng, M Sun, P Li - Findings of the Association for …, 2022 - aclanthology.org

Existing multilingual video corpus moment retrieval (mVCMR) methods are mainly based on
a two-stream structure. The visual stream utilizes the visual content in the video to estimate …

Cited by 19 · Related articles · All 3 versions

Dual-view curricular optimal transport for cross-lingual cross-modal retrieval

Y Wang, S Wang, H Luo, J Dong… - … on Image Processing, 2024 - ieeexplore.ieee.org

Current research on cross-modal retrieval is mostly English-oriented, as the availability of a
large number of English-oriented human-labeled vision-language corpora. In order to break …

Cited by 3 · Related articles · All 7 versions

Agree: Aligning cross-modal entities for image-text retrieval upon vision-language pre-trained models

X Wang, L Li, Z Li, X Wang, X Zhu, C Wang… - Proceedings of the …, 2023 - dl.acm.org

Image-text retrieval is a challenging cross-modal task that arouses much attention. While the
traditional methods cannot break down the barriers between different modalities, Vision …

Cited by 8 · Related articles

Inflate and shrink: Enriching and reducing interactions for fast text-image retrieval

H Liu, T Yu, P Li - Proceedings of the 2021 Conference on …, 2021 - aclanthology.org

By exploiting the cross-modal attention, cross-BERT methods have achieved state-of-the-art
accuracy in cross-modal retrieval. Nevertheless, the heavy text-image interactions in the …

Cited by 17 · Related articles · All 3 versions
