K Yang, J Deng, X An, J Li, Z Feng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract Contrastive Language-Image Pre-training (CLIP) has significantly boosted the performance of various vision-language tasks by scaling up the dataset with image-text pairs …
M El Banani, K Desai… - Proceedings of the ieee …, 2023 - openaccess.thecvf.com
Although an object may appear in numerous contexts, we often describe it in a limited number of ways. Language allows us to abstract away visual variation to represent and …
K Zheng, Y Zhang, W Wu, F Lu, S Ma, X Jin… - … on Computer Vision, 2025 - Springer
Abstract Language-image pre-training largely relies on how precisely and thoroughly a text describes its paired image. In practice, however, the contents of an image can be so rich that …
While existing backdoor attacks have successfully infected multimodal contrastive learning models such as CLIP they can be easily countered by specialized backdoor defenses for …
In the field of medical Vision-Language Pretraining (VLP), significant efforts have been devoted to deriving text and image features from both clinical reports and associated …
H Liu, K Son, J Yang, C Liu, J Gao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Image-text contrastive learning models such as CLIP have demonstrated strong task transfer ability. The high generality and usability of these visual models is achieved via a web-scale …
B Kim, Y Jo, J Kim, S Kim - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Abstract Contrastive Language-Image Pretraining has emerged as a prominent approach for training vision and text encoders with uncurated image-text pairs from the web. To enhance …
In the era of big data and Artificial Intelligence, an emerging paradigm is to utilize contrastive self-supervised learning to model large-scale heterogeneous data. Many existing foundation …
Contrastive language-image pre-training (CLIP) serves as a de-facto standard to align images and texts. Nonetheless, the loose correlation between images and texts of web …