How deep learning sees the world: A survey on adversarial attacks & defenses

JC Costa, T Roxo, H Proença, PRM Inácio - IEEE Access, 2024 - ieeexplore.ieee.org
Deep Learning is currently used to perform multiple tasks, such as object recognition, face
recognition, and natural language processing. However, Deep Neural Networks (DNNs) are …

Transferable adversarial attack for both vision transformers and convolutional networks via momentum integrated gradients

W Ma, Y Li, X Jia, W Xu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Visual Transformers (ViTs) and Convolutional Neural Networks (CNNs) are the two
primary backbone structures extensively used in various vision tasks. Generating …

Towards efficient adversarial training on vision transformers

B Wu, J Gu, Z Li, D Cai, X He, W Liu - European Conference on Computer …, 2022 - Springer
Vision Transformer (ViT), as a powerful alternative to Convolutional Neural Network
(CNN), has received much attention. Recent work showed that ViTs are also vulnerable to …

Revisiting adversarial training for ImageNet: Architectures, training and generalization across threat models

ND Singh, F Croce, M Hein - Advances in Neural …, 2024 - proceedings.neurips.cc
While adversarial training has been extensively studied for ResNet architectures and low
resolution datasets like CIFAR-10, much less is known for ImageNet. Given the recent …

Robustifying token attention for vision transformers

Y Guo, D Stutz, B Schiele - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Despite the success of vision transformers (ViTs), they still suffer from significant drops in
accuracy in the presence of common corruptions, such as noise or blur. Interestingly, we …

Are vision transformers robust to patch perturbations?

J Gu, V Tresp, Y Qin - European Conference on Computer Vision, 2022 - Springer
Recent advances in Vision Transformer (ViT) have demonstrated its impressive
performance in image classification, which makes it a promising alternative to Convolutional …

You Are Catching My Attention: Are Vision Transformers Bad Learners under Backdoor Attacks?

Z Yuan, P Zhou, K Zou… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Vision Transformers (ViTs), which made a splash in the field of computer vision
(CV), have shaken the dominance of convolutional neural networks (CNNs). However, in the …

Give me your attention: Dot-product attention considered harmful for adversarial patch robustness

G Lovisotto, N Finnie, M Munoz… - Proceedings of the …, 2022 - openaccess.thecvf.com
Neural architectures based on attention such as vision transformers are revolutionizing
image recognition. Their main benefit is that attention allows reasoning about all parts of a …

Jailbreaking attack against multimodal large language model

Z Niu, H Ren, X Gao, G Hua, R Jin - arXiv preprint arXiv:2402.02309, 2024 - arxiv.org
This paper focuses on jailbreaking attacks against multi-modal large language models
(MLLMs), seeking to elicit MLLMs to generate objectionable responses to harmful user …

Self-ensembling vision transformer (SEViT) for robust medical image classification

F Almalik, M Yaqub, K Nandakumar - International Conference on Medical …, 2022 - Springer
Vision Transformers (ViT) are competing to replace Convolutional Neural Networks
(CNN) for various computer vision tasks in medical imaging such as classification and …