C Yang, Q Jin, F Du, J Guo, Y Zhou - Complex & Intelligent Systems, 2025 - Springer
By leveraging large-scale image-text paired data for pre-training, the model can efficiently
learn the alignment between images and text, significantly advancing the development of …