Fine-tuning vision-language models (VLMs) such as CLIP on downstream tasks is often necessary to optimize their performance. However, a major obstacle is the limited availability …
J Abdul Samadh, MH Gani, N Hussein, et al. - Advances in Neural Information Processing Systems, 2024 - proceedings.neurips.cc
The promising zero-shot generalization of vision-language models such as CLIP has led to their adoption using prompt learning for numerous downstream tasks. Previous works have …
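The "prompt learning" these works adopt typically means optimizing a few continuous context vectors that are prepended to the class-name embedding while the pretrained CLIP encoders stay frozen. A minimal CoOp-style sketch in PyTorch; the encoder stand-ins, dimensions, and hyperparameters are illustrative assumptions, not any cited paper's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptLearner(nn.Module):
    """Learnable context vectors prepended to frozen class-name embeddings."""
    def __init__(self, num_classes, n_ctx=4, dim=512):
        super().__init__()
        # the only trainable parameters: shared context token embeddings
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        # stand-in for tokenized class-name embeddings (kept frozen)
        self.register_buffer("cls_emb", torch.randn(num_classes, 1, dim))

    def forward(self):
        n_cls = self.cls_emb.size(0)
        ctx = self.ctx.unsqueeze(0).expand(n_cls, -1, -1)  # (C, n_ctx, D)
        return torch.cat([ctx, self.cls_emb], dim=1)       # (C, n_ctx+1, D)

learner = PromptLearner(num_classes=10)
opt = torch.optim.Adam(learner.parameters(), lr=2e-3)

# stand-ins for a frozen image encoder's outputs and a few labels
img_feats = torch.randn(8, 512)
labels = torch.randint(0, 10, (8,))

# mean pooling stands in for the frozen text encoder consuming the prompt
txt_feats = learner().mean(dim=1)
logits = F.normalize(img_feats, dim=-1) @ F.normalize(txt_feats, dim=-1).t() / 0.01
F.cross_entropy(logits, labels).backward()
opt.step()  # updates only the context vectors
```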
D Adila, C Shin, L Cai, F Sala - The Twelfth International Conference on Learning Representations, 2024 - openreview.net
Zero-shot inference is a powerful paradigm that enables the use of large pretrained models for downstream classification tasks without further training. However, these models are …
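"Zero-shot inference" here is the usual CLIP-style procedure: embed the image and one text prompt per candidate class, then pick the class with the highest cosine similarity, with no task-specific training at all. A minimal sketch, with random stand-in features in place of real encoder outputs:

```python
import torch
import torch.nn.functional as F

def zero_shot_classify(image_feats, class_text_feats):
    """Pick the class whose text embedding is most similar to the image.

    image_feats:      (N, D) features from a frozen image encoder
    class_text_feats: (C, D) features of prompts like "a photo of a {class}"
    Both are placeholders here; a real pipeline would use CLIP encoders.
    """
    img = F.normalize(image_feats, dim=-1)
    txt = F.normalize(class_text_feats, dim=-1)
    sims = img @ txt.t()        # (N, C) cosine similarities
    return sims.argmax(dim=-1)  # predicted class index per image

# toy usage with random stand-in features
preds = zero_shot_classify(torch.randn(4, 512), torch.randn(3, 512))
print(preds.shape)  # torch.Size([4])
```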
Generative modeling has been the dominant approach for large-scale pretraining and zero-shot generalization. In this work, we challenge this convention by showing that …
Y Shu, X Guo, J Wu, X Wang, et al. - International Conference on Machine Learning, 2023 - proceedings.mlr.press
Out-of-distribution (OOD) generalization, where the model needs to handle distribution shifts from training, is a major challenge of machine learning. Contrastive …
The conventional recipe for maximizing model accuracy is to (1) train multiple models with various hyperparameters and (2) pick the individual model which performs best on a held …
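The truncated recipe is the standard select-best-on-validation loop: train one model per hyperparameter setting, then keep only the single model with the best held-out score. A toy sketch of exactly that recipe; the linear model and random data are stand-ins:

```python
import torch
import torch.nn as nn

def train_one(lr, X, y, num_classes=3, epochs=100):
    # one hyperparameter setting -> one trained model
    model = nn.Linear(X.size(1), num_classes)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(model(X), y).backward()
        opt.step()
    return model

def val_accuracy(model, X, y):
    with torch.no_grad():
        return (model(X).argmax(dim=-1) == y).float().mean().item()

X_tr, y_tr = torch.randn(200, 16), torch.randint(0, 3, (200,))
X_val, y_val = torch.randn(50, 16), torch.randint(0, 3, (50,))

# (1) train multiple models with various hyperparameters ...
candidates = [train_one(lr, X_tr, y_tr) for lr in (1e-3, 1e-2, 1e-1)]
# (2) ... pick the single model that performs best on held-out data
best = max(candidates, key=lambda m: val_accuracy(m, X_val, y_val))
```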
Semantic-descriptor-based Generalized Zero-Shot Learning (GZSL) is challenged by the need to recognize novel classes at test time. The development of generative models enables …
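One widely used way generative models enable this, sketched below under assumed dimensions (a common recipe in this literature, not necessarily this entry's exact method): a generator conditioned on a class's semantic descriptor synthesizes visual features for unseen classes, so a conventional classifier can then be trained over seen (real) and unseen (synthetic) classes alike.

```python
import torch
import torch.nn as nn

class FeatureGenerator(nn.Module):
    """Maps (semantic descriptor, noise) -> synthetic visual feature."""
    def __init__(self, sem_dim=85, noise_dim=32, feat_dim=512):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(sem_dim + noise_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, semantics, n_per_class=10):
        # semantics: (C, sem_dim) class descriptors, e.g. attribute vectors
        sem = semantics.repeat_interleave(n_per_class, dim=0)
        noise = torch.randn(sem.size(0), self.noise_dim)
        return self.net(torch.cat([sem, noise], dim=-1))

gen = FeatureGenerator()
unseen_sem = torch.randn(5, 85)  # stand-in descriptors for 5 unseen classes
fake_feats = gen(unseen_sem)     # (50, 512) synthetic unseen-class features
# in a full pipeline these are mixed with real seen-class features to
# train a classifier that covers both seen and unseen labels
```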