Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval

H Lin, H Wen, X Song, M Liu, Y Hu, L Nie - Proceedings of the 47th …, 2024 - dl.acm.org
Composed Image Retrieval (CIR) allows users to search target images with a multimodal
query, comprising a reference image and a modification text that describes the user's …

Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval

YK Jang, D Kim, Z Meng, D Huynh… - Proceedings of the …, 2024 - openaccess.thecvf.com
Composed Image Retrieval (CIR) is a task that retrieves images similar to a query
based on a provided textual modification. Current techniques rely on supervised learning for …

Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval

H Wen, X Song, X Chen, Y Wei, L Nie… - Proceedings of the 47th …, 2024 - dl.acm.org
Composed image retrieval (CIR) aims to retrieve the target image based on a multimodal
query, i.e., a reference image paired with corresponding modification text. Recent CIR studies …

Pseudo-triplet Guided Few-shot Composed Image Retrieval

B Hou, H Lin, H Wen, M Liu, X Song - arXiv preprint arXiv:2407.06001, 2024 - arxiv.org
Composed Image Retrieval (CIR) is a challenging task that aims to retrieve the target image
based on a multimodal query, i.e., a reference image and its corresponding modification text …

Self-Distilled Dynamic Fusion Network for Language-Based Fashion Retrieval

Y Wu, H Li, F Wang, Y Zhang… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
In the domain of language-based fashion image retrieval, pinpointing the desired fashion
item using both a reference image and its accompanying textual description is an intriguing …