E Ali, S Silva, MH Khan - arXiv preprint arXiv:2408.08855, 2024 - arxiv.org
Vision-language models (VLMs), e.g., CLIP, have shown remarkable potential in zero-shot
image classification. However, adapting these models to new domains remains challenging …