Related articles - Academic resource search

DRSL: Deep relational similarity learning for cross-modal retrieval

X Wang, P Hu, L Zhen, D Peng - Information Sciences, 2021 - Elsevier

Cross-modal retrieval aims to retrieve relevant samples across different media modalities.
Existing cross-modal retrieval approaches are contingent on learning common …

Cited by 57 · Related articles · All 2 versions

[PDF] bjtu.edu.cn

Cross-modal retrieval with CNN visual features: A new baseline

Y Wei, Y Zhao, C Lu, S Wei, L Liu… - IEEE transactions on …, 2016 - ieeexplore.ieee.org

Recently, convolutional neural network (CNN) visual features have demonstrated their
powerful ability as a universal representation for various recognition tasks. In this paper …

Cited by 418 · Related articles · All 4 versions

[PDF] arxiv.org

Cross-modal retrieval: a systematic review of methods and future directions

F Li, L Zhu, T Wang, J Li, Z Zhang, HT Shen - arXiv preprint arXiv …, 2023 - arxiv.org

With the exponential surge in diverse multi-modal data, traditional uni-modal retrieval
methods struggle to meet the needs of users demanding access to data from various …

Cited by 15 · Related articles · All 3 versions

[PDF] thecvf.com

EI-CLIP: Entity-aware interventional contrastive learning for e-commerce cross-modal retrieval

H Ma, H Zhao, Z Lin, A Kale, Z Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com

… recommendation, and marketing services. Extensive efforts have been made to
conquer the cross-modal retrieval problem in the general domain. When it comes to E …

Cited by 53 · Related articles · All 3 versions

[PDF] thecvf.com

COTS: Collaborative two-stream vision-language pre-training model for cross-modal retrieval

H Lu, N Fei, Y Huo, Y Gao, Z Lu… - Proceedings of the …, 2022 - openaccess.thecvf.com

Large-scale single-stream pre-training has shown dramatic performance in image-text
retrieval. Regrettably, it faces low inference efficiency due to heavy attention layers …

Cited by 71 · Related articles · All 6 versions

[PDF] thecvf.com

Deep joint-semantics reconstructing hashing for large-scale unsupervised cross-modal retrieval

S Su, Z Zhong, C Zhang - Proceedings of the IEEE/CVF …, 2019 - openaccess.thecvf.com

Cross-modal hashing encodes the multimedia data into a common binary hash space in
which the correlations among the samples from different modalities can be effectively …

Cited by 275 · Related articles · All 3 versions

[PDF] arxiv.org

On metric learning for audio-text cross-modal retrieval

X Mei, X Liu, J Sun, MD Plumbley, W Wang - arXiv preprint arXiv …, 2022 - arxiv.org

Audio-text retrieval aims at retrieving a target audio clip or caption from a pool of candidates
given a query in another modality. Solving such cross-modal retrieval task is challenging …

Cited by 64 · Related articles · All 9 versions

[PDF] thecvf.com

Mutual quantization for cross-modal search with noisy labels

E Yang, D Yao, T Liu, C Deng - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Deep cross-modal hashing has become an essential tool for supervised multimodal search.
These models tend to be optimized with large, curated multimodal datasets, where most …

Cited by 32 · Related articles · All 3 versions

[PDF] thecvf.com

Polysemous visual-semantic embedding for cross-modal retrieval

Y Song, M Soleymani - … of the IEEE/CVF Conference on …, 2019 - openaccess.thecvf.com

Visual-semantic embedding aims to find a shared latent space where related visual and
textual instances are close to each other. Most current methods learn injective embedding …

Cited by 301 · Related articles · All 8 versions

[PDF] thecvf.com

VoP: Text-video co-operative prompt tuning for cross-modal retrieval

S Huang, B Gong, Y Pan, J Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Many recent studies leverage the pre-trained CLIP for text-video cross-modal retrieval by
tuning the backbone with additional heavy modules, which not only brings huge …

Cited by 45 · Related articles · All 7 versions
