J Park, B Han - … IEEE/CVF Conference on Computer Vision …, 2023 - openaccess.thecvf.com
… believed to harm vision-language models [16, 39] by breaking the semantic consistency
of the image-text pair, but we argue that this does not hold for the large-scale vision-language …