- 学术资源搜索

Attention mechanisms in computer vision: A survey

MH Guo, TX Xu, JJ Liu, ZN Liu, PT Jiang, TJ Mu… - Computational visual …, 2022 - Springer

Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …

被引用次数：1503 相关文章所有 8 个版本

[HTML] sciencedirect.com

[HTML][HTML] Transformers in medical image analysis

K He, C Gan, Z Li, I Rekik, Z Yin, W Ji, Y Gao, Q Wang… - Intelligent …, 2023 - Elsevier

Transformers have dominated the field of natural language processing and have recently
made an impact in the area of computer vision. In the field of medical image analysis …

被引用次数：261 相关文章所有 12 个版本

[PDF] arxiv.org

Vision mamba: Efficient visual representation learning with bidirectional state space model

L Zhu, B Liao, Q Zhang, X Wang, W Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Recently the state space models (SSMs) with efficient hardware-aware designs, ie, the
Mamba deep learning model, have shown great potential for long sequence modeling …

被引用次数：408 相关文章所有 5 个版本

[PDF] arxiv.org

Exploring plain vision transformer backbones for object detection

Y Li, H Mao, R Girshick, K He - European conference on computer vision, 2022 - Springer

We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

被引用次数：718 相关文章所有 6 个版本

[PDF] neurips.cc

Vision gnn: An image is worth graph of nodes

K Han, Y Wang, J Guo, Y Tang… - Advances in neural …, 2022 - proceedings.neurips.cc

Network architecture plays a key role in the deep learning-based computer vision system.
The widely-used convolutional neural network and transformer treat the image as a grid or …

被引用次数：287 相关文章所有 8 个版本

[PDF] neurips.cc

Efficientformer: Vision transformers at mobilenet speed

Y Li, G Yuan, Y Wen, J Hu… - Advances in …, 2022 - proceedings.neurips.cc

Abstract Vision Transformers (ViT) have shown rapid progress in computer vision tasks,
achieving promising results on various benchmarks. However, due to the massive number of …

被引用次数：276 相关文章所有 6 个版本

[PDF] thecvf.com

Scaling up your kernels to 31x31: Revisiting large kernel design in cnns

X Ding, X Zhang, J Han, G Ding - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …

被引用次数：836 相关文章所有 10 个版本

[PDF] springer.com

Visual attention network

MH Guo, CZ Lu, ZN Liu, MM Cheng, SM Hu - Computational Visual Media, 2023 - Springer

While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …

被引用次数：595 相关文章所有 8 个版本

[PDF] arxiv.org

Unext: Mlp-based rapid medical image segmentation network

JMJ Valanarasu, VM Patel - … conference on medical image computing and …, 2022 - Springer

UNet and its latest extensions like TransUNet have been the leading medical image
segmentation methods in recent years. However, these networks cannot be effectively …

被引用次数：500 相关文章所有 4 个版本

[PDF] arxiv.org

Deit iii: Revenge of the vit

H Touvron, M Cord, H Jégou - European conference on computer vision, 2022 - Springer

Abstract A Vision Transformer (ViT) is a simple neural architecture amenable to serve
several computer vision tasks. It has limited built-in architectural priors, in contrast to more …

被引用次数：311 相关文章所有 8 个版本

高级搜索

QQ 群

Attention mechanisms in computer vision: A survey

[HTML][HTML] Transformers in medical image analysis

Vision mamba: Efficient visual representation learning with bidirectional state space model

Exploring plain vision transformer backbones for object detection

Vision gnn: An image is worth graph of nodes

Efficientformer: Vision transformers at mobilenet speed

Scaling up your kernels to 31x31: Revisiting large kernel design in cnns

Visual attention network

Unext: Mlp-based rapid medical image segmentation network

Deit iii: Revenge of the vit

引用