D Zhou, B Kang, X Jin, L Yang, X Lian, Z Jiang… - arXiv preprint arXiv …, 2021 - arxiv.org
Vision transformers (ViTs) have recently been applied successfully to image classification
tasks. In this paper, we show that, unlike convolutional neural networks (CNNs) that can be …