Adversarial representation learning for text-to-image matching

S Lu, M Liu, L Yin, Z Yin, X Liu, W Zheng - PeerJ Computer Science, 2023 - peerj.com

Visual Question Answering (VQA) is a significant cross-disciplinary issue in the
fields of computer vision and natural language processing that requires a computer to output …

Cited by 183

Image-text retrieval: A survey on recent research and development

M Cao, S Li, J Li, L Nie, M Zhang - arXiv preprint arXiv:2203.14713, 2022 - arxiv.org

In the past few years, cross-modal image-text retrieval (ITR) has experienced increased
interest in the research community due to its excellent research value and broad real-world …

Cited by 79

Cross-modal implicit relation reasoning and aligning for text-to-image person retrieval

D Jiang, M Ye - Proceedings of the IEEE/CVF Conference …, 2023 - openaccess.thecvf.com

Text-to-image person retrieval aims to identify the target person based on a given textual
description query. The primary challenge is to learn the mapping of visual and textual …

Cited by 115

Self-supervised learning: Generative or contrastive

X Liu, F Zhang, Z Hou, L Mian, Z Wang… - IEEE transactions on …, 2021 - ieeexplore.ieee.org

Deep supervised learning has achieved great success in the last decade. However, its
defects of heavy dependence on manual labels and vulnerability to attacks have driven …

Cited by 1782

Dual-level representation enhancement on characteristic and context for image-text retrieval

S Yang, Q Li, W Li, X Li, AA Liu - IEEE Transactions on Circuits …, 2022 - ieeexplore.ieee.org

Image-text retrieval is a fundamental and vital task in multi-media retrieval and has received
growing attention since it connects heterogeneous data. Previous methods that perform well …

Cited by 99

CLIP-driven fine-grained text-image person re-identification

S Yan, N Dong, L Zhang, J Tang - IEEE Transactions on Image …, 2023 - ieeexplore.ieee.org

Text-Image Person Re-identification (TIReID) aims to retrieve the image corresponding to
the given text query from a pool of candidate images. Existing methods employ prior …

Cited by 95

Cross-modality person re-identification with shared-specific feature transfer

Y Lu, Y Wu, B Liu, T Zhang, B Li… - Proceedings of the …, 2020 - openaccess.thecvf.com

Cross-modality person re-identification (cm-ReID) is a challenging but key technology for
intelligent video analysis. Existing works mainly focus on learning modality-shared …

Cited by 356

FashionVLP: Vision language transformer for fashion retrieval with feedback

S Goenka, Z Zheng, A Jaiswal… - Proceedings of the …, 2022 - openaccess.thecvf.com

Fashion image retrieval based on a query pair of reference image and natural language
feedback is a challenging task that requires models to assess fashion related information …

Cited by 85

See finer, see more: Implicit modality alignment for text-based person retrieval

X Shu, W Wen, H Wu, K Chen, Y Song, R Qiao… - … on Computer Vision, 2022 - Springer

Text-based person retrieval aims to find the query person based on a textual description.
The key is to learn a common latent space mapping between visual-textual modalities. To …

Cited by 79

Learning granularity-unified representations for text-to-image person re-identification

Z Shao, X Zhang, M Fang, Z Lin, J Wang… - Proceedings of the 30th …, 2022 - dl.acm.org

Text-to-image person re-identification (ReID) aims to search for pedestrian images of an
interested identity via textual descriptions. It is challenging due to both rich intra-modal …

Cited by 92
