L Song, R Xue, H Wang, H Sun… - Advances in Neural …, 2024 - proceedings.neurips.cc
Contrastive vision-language pre-training, known as CLIP, demonstrates remarkable
potential in perceiving open-world visual concepts, enabling effective zero-shot image …