Comparison of vision transformers and convolutional neural networks in medical image analysis: a systematic review

S Takahashi, Y Sakaguchi, N Kouno… - Journal of Medical …, 2024 - Springer
In the rapidly evolving field of medical image analysis utilizing artificial intelligence (AI), the
selection of appropriate computational models is critical for accurate diagnosis and patient …

Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining

J Liu, H Yang, HY Zhou, Y Xi, L Yu, C Li… - … Conference on Medical …, 2024 - Springer
Accurate medical image segmentation demands the integration of multi-scale information,
spanning from local features to global dependencies. However, it is challenging for existing …

U-KAN makes strong backbone for medical image segmentation and generation

C Li, X Liu, W Li, C Wang, H Liu, Y Liu, Z Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
U-Net has become a cornerstone in various visual applications such as image segmentation
and diffusion probability models. While numerous innovative designs and improvements …

CoBEVT: Cooperative bird's eye view semantic segmentation with sparse transformers

R Xu, Z Tu, H Xiang, W Shao, B Zhou, J Ma - arXiv preprint arXiv …, 2022 - arxiv.org
Bird's eye view (BEV) semantic segmentation plays a crucial role in spatial sensing for
autonomous driving. Although recent literature has made significant progress on BEV map …

RMT: Retentive networks meet vision transformers

Q Fan, H Huang, M Chen, H Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The Vision Transformer (ViT) has gained increasing attention in the computer vision
community in recent years. However, the core component of ViT, Self-Attention, lacks explicit …

Enhancing crop productivity and sustainability through disease identification in maize leaves: Exploiting a large dataset with an advanced vision transformer model

I Pacal - Expert Systems with Applications, 2024 - Elsevier
The timely identification of diseases in maize leaves offers several benefits, such as increased
crop productivity, reduced reliance on harmful chemicals, and improved production of …

SwiftFormer: Efficient additive attention for transformer-based real-time mobile vision applications

A Shaker, M Maaz, H Rasheed… - Proceedings of the …, 2023 - openaccess.thecvf.com
Self-attention has become a de facto choice for capturing global context in various vision
applications. However, its quadratic computational complexity with respect to image …

MambaVision: A hybrid Mamba-Transformer vision backbone

A Hatamizadeh, J Kautz - arXiv preprint arXiv:2407.08083, 2024 - arxiv.org
We propose a novel hybrid Mamba-Transformer backbone, denoted as MambaVision, which
is specifically tailored for vision applications. Our core contribution includes redesigning the …

NoisyQuant: Noisy bias-enhanced post-training activation quantization for vision transformers

Y Liu, H Yang, Z Dong, K Keutzer… - Proceedings of the …, 2023 - openaccess.thecvf.com
The complicated architecture and high training cost of vision transformers motivate the
exploration of post-training quantization. However, the heavy-tailed distribution of vision …

FasterViT: Fast vision transformers with hierarchical attention

A Hatamizadeh, G Heinrich, H Yin, A Tao… - arXiv preprint arXiv …, 2023 - arxiv.org
We design a new family of hybrid CNN-ViT neural networks, named FasterViT, with a focus
on high image throughput for computer vision (CV) applications. FasterViT combines the …