Promptstyler: Prompt-driven style generation for source-free domain generalization

J Cho, G Nam, S Kim, H Yang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In a joint vision-language space, a text feature (e.g., from "a photo of a dog") could effectively
represent its relevant image features (e.g., from dog photos). Also, a recent study has
demonstrated the cross-modal transferability phenomenon of this joint space. From these
observations, we propose PromptStyler, which deals with source-free domain generalization
by simulating various distribution shifts in the joint space, synthesizing diverse styles via
prompts without using any images. The proposed method learns to generate a variety of …
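The key observation above is that text and image features live on the same hypersphere, so similarity between a prompt's text feature and image features is meaningful. A minimal sketch of that idea, using toy hand-picked feature vectors in place of real encoder outputs (the features and their dimensionality are illustrative assumptions, not CLIP values):

```python
import math

def normalize(v):
    # Project a feature onto the unit hypersphere, as in CLIP-like joint spaces.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(u, v):
    # Cosine similarity between two L2-normalized features.
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))

# Toy 4-d features standing in for encoder outputs (illustrative values only).
text_feat_dog = [0.9, 0.1, 0.0, 0.2]   # e.g. text feature of "a photo of a dog"
img_feat_dog  = [0.8, 0.2, 0.1, 0.1]   # e.g. image feature of a dog photo
img_feat_cat  = [0.1, 0.9, 0.3, 0.0]   # e.g. image feature of a cat photo

# The prompt's text feature is closer to its relevant image feature.
print(cosine(text_feat_dog, img_feat_dog) > cosine(text_feat_dog, img_feat_cat))  # True
```

Because of this alignment, a classifier trained on text features of style-varied prompts can transfer to image features at test time, which is the cross-modal transferability the abstract refers to.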

[PDF][PDF] PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization (Supplementary Material)

J Cho, G Nam, S Kim, H Yang, S Kwak - openaccess.thecvf.com
We choose CLIP [13] as our pre-trained vision-language model; it is a large-scale model
trained on 400 million image-text pairs. Note that the proposed method is broadly
applicable to CLIP-like vision-language models [7, 16], which also construct
hyperspherical joint vision-language spaces via contrastive learning. Given a
batch of image-text pairs, such models jointly train an image encoder and a text encoder
using similarity scores computed over the batch's image-text pairings. Joint vision-language …
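The contrastive training described above can be sketched with a symmetric batch loss in which matched image-text pairs share an index and every other pairing in the batch acts as a negative. This is a minimal pure-Python sketch under that assumption; the function name, toy 2-d features, and temperature value are illustrative, not taken from the paper:

```python
import math

def contrastive_batch_loss(img_feats, txt_feats, temperature=0.07):
    """Symmetric contrastive loss over a batch of (image, text) pairs.

    Matched pairs share an index; all other pairings in the batch are
    treated as negatives, as in CLIP-like training.
    """
    def norm(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    I = [norm(v) for v in img_feats]
    T = [norm(v) for v in txt_feats]

    # Pairwise similarity logits, scaled by the temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in T] for i in I]

    def cross_entropy(row, target):
        # Numerically stable log-softmax cross-entropy for one row.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        return log_z - row[target]

    n = len(I)
    loss_i2t = sum(cross_entropy(logits[k], k) for k in range(n)) / n
    loss_t2i = sum(cross_entropy([logits[r][k] for r in range(n)], k)
                   for k in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)

# Toy batch: matched pairs are nearly aligned, so the loss is small;
# swapping the text features misaligns the pairs and raises the loss.
imgs = [[1.0, 0.0], [0.0, 1.0]]
txts = [[0.9, 0.1], [0.1, 0.9]]
print(contrastive_batch_loss(imgs, txts) < contrastive_batch_loss(imgs, txts[::-1]))  # True
```

Minimizing this loss pulls each text feature toward its paired image feature on the hypersphere, which is what makes the joint space usable for prompt-driven style synthesis without images.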