R Liu, Y Li, L Tao, D Liang, HT Zheng - Patterns, 2022 - cell.com
The recently proposed deep multilayer perceptron (MLP) models have stirred up considerable interest in the vision community. Historically, the availability of larger datasets combined with …
JMJ Valanarasu, VM Patel - … conference on medical image computing and …, 2022 - Springer
UNet and its latest extensions like TransUNet have been the leading medical image segmentation methods in recent years. However, these networks cannot be effectively …
In this work, we introduce Dual Attention Vision Transformers (DaViT), a simple yet effective vision transformer architecture that is able to capture global context while maintaining …
Q Hou, CZ Lu, MM Cheng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Vision Transformers have been the most popular network architecture in visual recognition recently due to their strong ability to encode global information. However, its high …
An Axial Shifted MLP architecture (AS-MLP) is proposed in this paper. Different from MLP-Mixer, where the global spatial feature is encoded for information flow through matrix …
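The axial-shift idea referenced in the AS-MLP snippet can be illustrated with a minimal NumPy sketch: channel groups of a feature map are shifted by different offsets along one spatial axis, so a subsequent channel MLP sees features from neighboring positions. The function name, group scheme, and zero-padding choice below are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def axial_shift(x, axis, shift_size=3):
    """Shift channel groups of x (C, H, W) along a spatial axis.

    Channels are split into `shift_size` groups; group g is shifted by
    offset g - shift_size // 2 along `axis`, with zero padding at the
    border (illustrative sketch of the axial-shift idea).
    """
    out = np.zeros_like(x)
    groups = np.array_split(np.arange(x.shape[0]), shift_size)
    for g, chans in enumerate(groups):
        offset = g - shift_size // 2
        shifted = np.roll(x[chans], offset, axis=axis)
        # zero out wrapped-around positions so the shift pads with zeros
        idx = [slice(None)] * 3
        if offset > 0:
            idx[axis] = slice(0, offset)
            shifted[tuple(idx)] = 0
        elif offset < 0:
            idx[axis] = slice(offset, None)
            shifted[tuple(idx)] = 0
        out[chans] = shifted
    return out

x = np.random.rand(6, 8, 8)
h_shifted = axial_shift(x, axis=1)  # shift channel groups along height
w_shifted = axial_shift(x, axis=2)  # shift channel groups along width
```

A channel-wise fully-connected layer applied after the horizontal and vertical shifts then mixes information from axially adjacent locations.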
In the field of computer vision, recent works show that a pure MLP architecture, stacked mainly from fully-connected layers, can achieve competitive performance with CNNs and …
Previous vision MLPs such as MLP-Mixer and ResMLP accept linearly flattened image patches as input, making them inflexible for different input sizes and hard to capture spatial …
MLP-Mixer has recently emerged as a new challenger to CNNs and Transformers. Despite its simplicity compared to the Transformer, the concept of channel-mixing …
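The channel-mixing concept mentioned in this snippet pairs with token-mixing in a Mixer block: one MLP acts across the token (patch) axis per channel, the other across the channel axis per token. A minimal NumPy sketch, with LayerNorm and GELU replaced by identity/ReLU to keep it short (the weight names are illustrative assumptions):

```python
import numpy as np

def mixer_block(x, W_tok1, W_tok2, W_ch1, W_ch2):
    """Simplified Mixer block on x of shape (tokens, channels).

    Token mixing applies an MLP across the token axis (per channel);
    channel mixing applies an MLP across the channel axis (per token).
    Both use residual connections, as in MLP-Mixer.
    """
    relu = lambda a: np.maximum(a, 0)
    # token mixing: weights multiply from the left, mixing patches
    y = x + W_tok2 @ relu(W_tok1 @ x)
    # channel mixing: weights multiply from the right, mixing features
    z = y + relu(y @ W_ch1) @ W_ch2
    return z

T, C, Ht, Hc = 16, 8, 32, 32  # tokens, channels, hidden widths
rng = np.random.default_rng(0)
x = rng.standard_normal((T, C))
out = mixer_block(
    x,
    rng.standard_normal((Ht, T)), rng.standard_normal((T, Ht)),
    rng.standard_normal((C, Hc)), rng.standard_normal((Hc, C)),
)
```

Note how the two mixing steps differ only in which axis the weight matrices contract, which is the structural distinction the snippet alludes to.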
Y Tatsunami, M Taki - Advances in Neural Information …, 2022 - proceedings.neurips.cc
In recent computer vision research, the advent of the Vision Transformer (ViT) has rapidly revolutionized various architectural design efforts: ViT achieved state-of-the-art image …