Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives

J Li, J Chen, Y Tang, C Wang, BA Landman… - Medical image …, 2023 - Elsevier
Transformer, one of the latest technological advances of deep learning, has gained
prevalence in natural language processing and computer vision. Since medical imaging bears …

A review of convolutional neural network architectures and their optimizations

S Cong, Y Zhou - Artificial Intelligence Review, 2023 - Springer
The research advances concerning the typical architectures of convolutional neural
networks (CNNs) as well as their optimizations are analyzed and elaborated in detail in this …

Vision transformer adapter for dense predictions

Z Chen, Y Duan, W Wang, J He, T Lu, J Dai… - arXiv preprint arXiv …, 2022 - arxiv.org
This work investigates a simple yet powerful adapter for Vision Transformer (ViT). Unlike
recent visual transformers that introduce vision-specific inductive biases into their …

On the integration of self-attention and convolution

X Pan, C Ge, R Lu, S Song, G Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
Convolution and self-attention are two powerful techniques for representation learning, and
they are usually considered as two peer approaches that are distinct from each other. In this …

Mobile-former: Bridging mobilenet and transformer

Y Chen, X Dai, D Chen, M Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com
We present Mobile-Former, a parallel design of MobileNet and transformer with a
two-way bridge in between. This structure leverages the advantages of MobileNet at local …

Vitaev2: Vision transformer advanced by exploring inductive bias for image recognition and beyond

Q Zhang, Y Xu, J Zhang, D Tao - International Journal of Computer Vision, 2023 - Springer
Vision transformers have shown great potential in various computer vision tasks owing to
their strong capability to model long-range dependency using the self-attention mechanism …

Vitae: Vision transformer advanced by exploring intrinsic inductive bias

Y Xu, Q Zhang, J Zhang, D Tao - Advances in neural …, 2021 - proceedings.neurips.cc
Transformers have shown great potential in various computer vision tasks owing to their
strong capability in modeling long-range dependency using the self-attention mechanism …

Hrformer: High-resolution transformer for dense prediction

Y Yuan, R Fu, L Huang, W Lin… - Advances in neural …, 2021 - proceedings.neurips.cc
We present a High-Resolution Transformer (HRFormer) that learns high-resolution
representations for dense prediction tasks, in contrast to the original Vision Transformer that …

Mixformer: Mixing features across windows and dimensions

Q Chen, Q Wu, J Wang, Q Hu, T Hu… - Proceedings of the …, 2022 - openaccess.thecvf.com
While local-window self-attention performs notably in vision tasks, it suffers from limited
receptive field and weak modeling capability issues. This is mainly because it performs self …