Visual instruction tuning

H Liu, C Li, Q Wu, YJ Lee - Advances in neural information …, 2024 - proceedings.neurips.cc
Instruction tuning large language models (LLMs) using machine-generated instruction-
following data has been shown to improve zero-shot capabilities on new tasks, but the idea …

EVA-02: A visual representation for neon genesis

Y Fang, Q Sun, X Wang, T Huang, X Wang… - Image and Vision …, 2024 - Elsevier
We launch EVA-02, a next-generation Transformer-based visual representation pre-trained
to reconstruct strong and robust language-aligned vision features via masked image …

MobileCLIP: Fast image-text models through multi-modal reinforced training

PKA Vasu, H Pouransari, F Faghri… - Proceedings of the …, 2024 - openaccess.thecvf.com
Contrastive pre-training of image-text foundation models such as CLIP demonstrated
excellent zero-shot performance and improved robustness on a wide range of downstream …

Backdoor Attacks to Deep Neural Networks: A Survey of the Literature, Challenges, and Future Research Directions

O Mengara, A Avila, TH Falk - IEEE Access, 2024 - ieeexplore.ieee.org
Deep neural network (DNN) classifiers are potent instruments that can be used in various
security-sensitive applications. Nonetheless, they are vulnerable to certain attacks that …

Weight subcloning: direct initialization of transformers using larger pretrained ones

M Samragh, M Farajtabar, S Mehta… - arXiv preprint arXiv …, 2023 - arxiv.org
Training large transformer models from scratch for a target task requires lots of data and is
computationally demanding. The usual practice of transfer learning overcomes this …

Good Teachers Explain: Explanation-Enhanced Knowledge Distillation

A Parchami-Araghi, M Böhle, S Rao… - arXiv preprint arXiv …, 2024 - arxiv.org
Knowledge Distillation (KD) has proven effective for compressing large teacher models into
smaller student models. While it is well known that student models can achieve similar …

Towards Deep Learning Models Resistant to Transfer-based Adversarial Attacks via Data-centric Robust Learning

Y Yang, C Lin, X Ji, Q Tian, Q Li, H Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Transfer-based adversarial attacks raise a severe threat to real-world deep learning systems
since they do not require access to target models. Adversarial training (AT), which is …

Steerable Visual Intelligence

H Liu - 2024 - asset.library.wisc.edu
Image-text contrastive learning models such as CLIP have demonstrated strong task transfer
ability. The high generality and usability of these visual models is achieved via a web-scale …