X Liu, J Wu, W Yang, X Zhou… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Pre-trained Vision-Language Models (VLMs), like CLIP, exhibit strong generalization ability
to downstream tasks but struggle in few-shot scenarios. Existing prompting techniques …