Automatic spatially-aware fashion concept discovery

WH Cheng, S Song, CY Chen, SC Hidayati… - ACM Computing Surveys …, 2021 - dl.acm.org

Fashion is the way we present ourselves to the world and has become one of the world's
largest industries. Fashion, mainly conveyed by vision, has thus attracted much attention …

Cited by 182 · All 6 versions

[PDF] sciencedirect.com

Multimodal research in vision and language: A review of current and emerging trends

S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier

Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …

Cited by 94 · All 5 versions

[PDF] thecvf.com

Zero-shot composed image retrieval with textual inversion

A Baldrati, L Agnolucci, M Bertini… - Proceedings of the …, 2023 - openaccess.thecvf.com

Composed Image Retrieval (CIR) aims to retrieve a target image based on a query
composed of a reference image and a relative caption that describes the difference between …

Cited by 58 · All 7 versions

[PDF] thecvf.com

Image retrieval on real-life images with pre-trained vision-and-language models

Z Liu, C Rodriguez-Opazo… - Proceedings of the …, 2021 - openaccess.thecvf.com

We extend the task of composed image retrieval, where an input query consists of an image
and short textual description of how to modify the image. Existing methods have only been …

Cited by 151 · All 7 versions

[PDF] thecvf.com

FashionVLP: Vision language transformer for fashion retrieval with feedback

S Goenka, Z Zheng, A Jaiswal… - Proceedings of the …, 2022 - openaccess.thecvf.com

Fashion image retrieval based on a query pair of reference image and natural language
feedback is a challenging task that requires models to assess fashion related information …

Cited by 81 · All 5 versions

[PDF] thecvf.com

VITON: An image-based virtual try-on network

X Han, Z Wu, Z Wu, R Yu… - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com

We present an image-based VIrtual Try-On Network (VITON) without using 3D information
in any form, which seamlessly transfers a desired clothing item onto the corresponding …

Cited by 650 · All 10 versions

[PDF] thecvf.com

Multimodal garment designer: Human-centric latent diffusion models for fashion image editing

A Baldrati, D Morelli, G Cartella… - Proceedings of the …, 2023 - openaccess.thecvf.com

Fashion illustration is used by designers to communicate their vision and to bring the design
idea from conceptualization to realization, showing how clothes interact with the human …

Cited by 41 · All 8 versions

[PDF] thecvf.com

Composing text and image for image retrieval: an empirical odyssey

N Vo, L Jiang, C Sun, K Murphy, LJ Li… - Proceedings of the …, 2019 - openaccess.thecvf.com

In this paper, we study the task of image retrieval, where the input query is specified in the
form of an image plus some text that describes desired modifications to the input image. For …

Cited by 364 · All 9 versions

[PDF] openreview.net

Fine-tuning multimodal LLMs to follow zero-shot demonstrative instructions

J Li, K Pan, Z Ge, M Gao, W Ji, W Zhang… - The Twelfth …, 2023 - openreview.net

Recent advancements in Multimodal Large Language Models (MLLMs) have been utilizing
Visual Prompt Generators (VPGs) to convert visual features into tokens that LLMs can …

Cited by 38 · All 2 versions

[PDF] arxiv.org

Learning fashion compatibility with bidirectional LSTMs

X Han, Z Wu, YG Jiang, LS Davis - Proceedings of the 25th ACM …, 2017 - dl.acm.org

The ubiquity of online fashion shopping demands effective recommendation services for
customers. In this paper, we study two types of fashion recommendation: (i) suggesting an …

Cited by 430 · All 4 versions
