Discrete representations strengthen vision transformer robustness

C Mao, L Jiang, M Dehghani, C Vondrick… - arXiv preprint arXiv …, 2021 - arxiv.org
Vision Transformer (ViT) is emerging as the state-of-the-art architecture for image
recognition. While recent studies suggest that ViTs are more robust than their convolutional …

Discrete Representations Strengthen Vision Transformer Robustness

C Mao, L Jiang, M Dehghani, C Vondrick… - … Conference on Learning … - openreview.net

Discrete Representations Strengthen Vision Transformer Robustness

C Mao, L Jiang, M Dehghani, C Vondrick… - arXiv e …, 2021 - ui.adsabs.harvard.edu

Discrete Representations Strengthen Vision Transformer Robustness

C Mao, L Jiang, M Dehghani, CM Vondrick… - research.google