K He, C Gan, Z Li, I Rekik, Z Yin, W Ji, Y Gao, Q Wang… - Intelligent …, 2023 - Elsevier
Transformers have dominated the field of natural language processing and have recently made an impact in the area of computer vision. In the field of medical image analysis …
Recently the state space models (SSMs) with efficient hardware-aware designs, i.e., the Mamba deep learning model, have shown great potential for long sequence modeling …
We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for object detection. This design enables the original ViT architecture to be fine-tuned for object …
Network architecture plays a key role in the deep learning-based computer vision system. The widely-used convolutional neural network and transformer treat the image as a grid or …
Y Li, G Yuan, Y Wen, J Hu… - Advances in …, 2022 - proceedings.neurips.cc
Vision Transformers (ViT) have shown rapid progress in computer vision tasks, achieving promising results on various benchmarks. However, due to the massive number of …
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …
While originally designed for natural language processing tasks, the self-attention mechanism has recently taken various computer vision areas by storm. However, the 2D …
JMJ Valanarasu, VM Patel - … conference on medical image computing and …, 2022 - Springer
UNet and its latest extensions like TransUNet have been the leading medical image segmentation methods in recent years. However, these networks cannot be effectively …
A Vision Transformer (ViT) is a simple neural architecture amenable to serve several computer vision tasks. It has limited built-in architectural priors, in contrast to more …