W Ma, Y Li, X Jia, W Xu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Visual Transformers (ViTs) and Convolutional Neural Networks (CNNs) are the two primary backbone structures extensively used in various vision tasks. Generating …
Vision Transformer (ViT), as a powerful alternative to Convolutional Neural Network (CNN), has received much attention. Recent work showed that ViTs are also vulnerable to …
While adversarial training has been extensively studied for ResNet architectures and low-resolution datasets like CIFAR-10, much less is known for ImageNet. Given the recent …
Y Guo, D Stutz, B Schiele - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Despite the success of vision transformers (ViTs), they still suffer from significant drops in accuracy in the presence of common corruptions, such as noise or blur. Interestingly, we …
J Gu, V Tresp, Y Qin - European Conference on Computer Vision, 2022 - Springer
Recent advances in Vision Transformer (ViT) have demonstrated its impressive performance in image classification, which makes it a promising alternative to Convolutional …
Z Yuan, P Zhou, K Zou… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Vision Transformers (ViTs), which made a splash in the field of computer vision (CV), have shaken the dominance of convolutional neural networks (CNNs). However, in the …
Neural architectures based on attention such as vision transformers are revolutionizing image recognition. Their main benefit is that attention allows reasoning about all parts of a …
Z Niu, H Ren, X Gao, G Hua, R Jin - arXiv preprint arXiv:2402.02309, 2024 - arxiv.org
This paper focuses on jailbreaking attacks against multi-modal large language models (MLLMs), seeking to elicit MLLMs to generate objectionable responses to harmful user …
Vision Transformers (ViT) are competing to replace Convolutional Neural Networks (CNN) for various computer vision tasks in medical imaging such as classification and …