Hire-MLP: Vision MLP via hierarchical rearrangement

Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives

J Li, J Chen, Y Tang, C Wang, BA Landman… - Medical image …, 2023 - Elsevier

Transformer, one of the latest technological advances of deep learning, has gained
prevalence in natural language processing or computer vision. Since medical imaging bears …

Cited by 146 · Related articles · All 9 versions

Are we ready for a new paradigm shift? A survey on visual deep MLP

R Liu, Y Li, L Tao, D Liang, HT Zheng - Patterns, 2022 - cell.com

Recently, the proposed deep multilayer perceptron (MLP) models have stirred up a lot of
interest in the vision community. Historically, the availability of larger datasets combined with …

Cited by 70 · Related articles · All 7 versions

Vision GNN: An image is worth graph of nodes

K Han, Y Wang, J Guo, Y Tang… - Advances in neural …, 2022 - proceedings.neurips.cc

Network architecture plays a key role in the deep learning-based computer vision system.
The widely-used convolutional neural network and transformer treat the image as a grid or …

Cited by 272 · Related articles · All 8 versions

CMT: Convolutional neural networks meet vision transformers

J Guo, K Han, H Wu, Y Tang, X Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com

Vision transformers have been successfully applied to image recognition tasks due to their
ability to capture long-range dependencies within an image. However, there are still gaps in …

Cited by 657 · Related articles · All 6 versions

Focal modulation networks

J Yang, C Li, X Dai, J Gao - Advances in Neural Information …, 2022 - proceedings.neurips.cc

We propose focal modulation networks (FocalNets in short), where self-attention (SA) is
completely replaced by a focal modulation module for modeling token interactions in vision …

Cited by 170 · Related articles · All 6 versions

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 60 · Related articles · All 3 versions

An image patch is a wave: Phase-aware vision MLP

Y Tang, K Han, J Guo, C Xu, Y Li… - Proceedings of the …, 2022 - openaccess.thecvf.com

In the field of computer vision, recent works show that a pure MLP architecture mainly
stacked by fully-connected layers can achieve competing performance with CNN and …

Cited by 120 · Related articles · All 5 versions

One-for-all: Bridge the gap between heterogeneous architectures in knowledge distillation

Z Hao, J Guo, K Han, Y Tang, H Hu… - Advances in Neural …, 2024 - proceedings.neurips.cc

Knowledge distillation (KD) has proven to be a highly effective approach for
enhancing model performance through a teacher-student training scheme. However, most …

Cited by 14 · Related articles · All 5 versions

PointMixer: MLP-Mixer for point cloud understanding

J Choe, C Park, F Rameau, J Park… - European Conference on …, 2022 - Springer

MLP-Mixer has newly appeared as a new challenger against the realm of CNNs and
Transformer. Despite its simplicity compared to Transformer, the concept of channel-mixing …

Cited by 90 · Related articles · All 7 versions

Sequencer: Deep LSTM for image classification

Y Tatsunami, M Taki - Advances in Neural Information …, 2022 - proceedings.neurips.cc

In recent computer vision research, the advent of the Vision Transformer (ViT) has rapidly
revolutionized various architectural design efforts: ViT achieved state-of-the-art image …

Cited by 57 · Related articles · All 5 versions
