We launch EVA-02, a next-generation Transformer-based visual representation pre-trained to reconstruct strong and robust language-aligned vision features via masked image …
Contrastive pre-training of image-text foundation models such as CLIP has demonstrated excellent zero-shot performance and improved robustness on a wide range of downstream …
O Mengara, A Avila, TH Falk - IEEE Access, 2024 - ieeexplore.ieee.org
Deep neural network (DNN) classifiers are potent instruments that can be used in various security-sensitive applications. Nonetheless, they are vulnerable to certain attacks that …
Training large transformer models from scratch for a target task requires large amounts of data and is computationally demanding. The usual practice of transfer learning overcomes this …
Knowledge Distillation (KD) has proven effective for compressing large teacher models into smaller student models. While it is well known that student models can achieve similar …
Y Yang, C Lin, X Ji, Q Tian, Q Li, H Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Transfer-based adversarial attacks pose a severe threat to real-world deep learning systems since they do not require access to target models. Adversarial training (AT), which is …
Image-text contrastive learning models such as CLIP have demonstrated strong task transfer ability. The high generality and usability of these visual models are achieved via a web-scale …