Fashion meets computer vision: A survey

WH Cheng, S Song, CY Chen, SC Hidayati… - ACM Computing Surveys …, 2021 - dl.acm.org
Fashion is the way we present ourselves to the world and has become one of the world's
largest industries. Fashion, mainly conveyed by vision, has thus attracted much attention …

Multimodal research in vision and language: A review of current and emerging trends

S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier
Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …

Zero-shot composed image retrieval with textual inversion

A Baldrati, L Agnolucci, M Bertini… - Proceedings of the …, 2023 - openaccess.thecvf.com
Composed Image Retrieval (CIR) aims to retrieve a target image based on a query
composed of a reference image and a relative caption that describes the difference between …

Image retrieval on real-life images with pre-trained vision-and-language models

Z Liu, C Rodriguez-Opazo… - Proceedings of the …, 2021 - openaccess.thecvf.com
We extend the task of composed image retrieval, where an input query consists of an image
and short textual description of how to modify the image. Existing methods have only been …

FashionVLP: Vision language transformer for fashion retrieval with feedback

S Goenka, Z Zheng, A Jaiswal… - Proceedings of the …, 2022 - openaccess.thecvf.com
Fashion image retrieval based on a query pair of reference image and natural language
feedback is a challenging task that requires models to assess fashion related information …

VITON: An image-based virtual try-on network

X Han, Z Wu, Z Wu, R Yu… - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
We present an image-based VIrtual Try-On Network (VITON) without using 3D information
in any form, which seamlessly transfers a desired clothing item onto the corresponding …

Multimodal garment designer: Human-centric latent diffusion models for fashion image editing

A Baldrati, D Morelli, G Cartella… - Proceedings of the …, 2023 - openaccess.thecvf.com
Fashion illustration is used by designers to communicate their vision and to bring the design
idea from conceptualization to realization, showing how clothes interact with the human …

Composing text and image for image retrieval - an empirical odyssey

N Vo, L Jiang, C Sun, K Murphy, LJ Li… - Proceedings of the …, 2019 - openaccess.thecvf.com
In this paper, we study the task of image retrieval, where the input query is specified in the
form of an image plus some text that describes desired modifications to the input image. For …

Fine-tuning multimodal LLMs to follow zero-shot demonstrative instructions

J Li, K Pan, Z Ge, M Gao, W Ji, W Zhang… - The Twelfth …, 2023 - openreview.net
Recent advancements in Multimodal Large Language Models (MLLMs) have been utilizing
Visual Prompt Generators (VPGs) to convert visual features into tokens that LLMs can …

Learning fashion compatibility with bidirectional LSTMs

X Han, Z Wu, YG Jiang, LS Davis - Proceedings of the 25th ACM …, 2017 - dl.acm.org
The ubiquity of online fashion shopping demands effective recommendation services for
customers. In this paper, we study two types of fashion recommendation: (i) suggesting an …