UIGR: Unified interactive garment retrieval

X Han, X Zhu, L Yu, L Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

In the fashion domain, there exists a variety of vision-and-language (V+L) tasks, including
cross-modal retrieval, text-guided image retrieval, multi-modal classification, and image …

Cited by 29 Related articles All 8 versions

[PDF] arxiv.org

FashionViL: Fashion-focused vision-and-language representation learning

X Han, L Yu, X Zhu, L Zhang, YZ Song… - European conference on …, 2022 - Springer

Large-scale Vision-and-Language (V+L) pre-training for representation learning
has proven to be effective in boosting various downstream V+L tasks. However, when it …

Cited by 50 Related articles All 7 versions

Fashion-GPT: Integrating LLMs with Fashion Retrieval System

Q Chen, T Zhang, M Nie, Z Wang, S Xu, W Shi… - Proceedings of the 1st …, 2023 - dl.acm.org

Although customers on a fashion e-commerce platform express their clothing
preferences through combined imagery and textual information, they are limited to retrieve …

Cited by 12 Related articles

[PDF] github.io

[PDF] Benchmarking Robustness of Text-Image Composed Retrieval

S Sun, J Gu, S Gong - arXiv preprint arXiv:2311.14837, 2023 - suntongtongtong.github.io

Text-image composed retrieval aims to retrieve the target image through the composed
query, which is specified in the form of an image plus some text that describes desired …

Cited by 1 Related articles All 5 versions

PCaSM: Text-guided composed image retrieval with parallel content and style modules

J Zhang, J Zhang, H Wu, Z Zhao… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org

The query for text-guided image retrieval references two parts: the first part is the image, and
the second part is the text describing the part of the image that needs to be modified. By …

Cited by 1 Related articles All 2 versions

[PDF] openreview.net

Simplifying Referred Visual Search with Conditional Contrastive Learning

S Lepage, J Mary, D Picard - openreview.net

This paper introduces a new challenge for image similarity search in the context of fashion,
addressing the inherent ambiguity in this domain stemming from complex images. We …
