Comparison of vision transformers and convolutional neural networks in medical image analysis: a systematic review

S Takahashi, Y Sakaguchi, N Kouno… - Journal of Medical …, 2024 - Springer
In the rapidly evolving field of medical image analysis utilizing artificial intelligence (AI), the
selection of appropriate computational models is critical for accurate diagnosis and patient …

Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining

J Liu, H Yang, HY Zhou, Y Xi, L Yu, C Li… - … Conference on Medical …, 2024 - Springer
Accurate medical image segmentation demands the integration of multi-scale information,
spanning from local features to global dependencies. However, it is challenging for existing …

U-KAN makes strong backbone for medical image segmentation and generation

C Li, X Liu, W Li, C Wang, H Liu, Y Liu, Z Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
U-Net has become a cornerstone in various visual applications such as image segmentation
and diffusion probability models. While numerous innovative designs and improvements …

CoBEVT: Cooperative bird's eye view semantic segmentation with sparse transformers

R Xu, Z Tu, H Xiang, W Shao, B Zhou, J Ma - arXiv preprint arXiv …, 2022 - arxiv.org
Bird's eye view (BEV) semantic segmentation plays a crucial role in spatial sensing for
autonomous driving. Although recent literature has made significant progress on BEV map …

RMT: Retentive networks meet vision transformers

Q Fan, H Huang, M Chen, H Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The Vision Transformer (ViT) has gained increasing attention in the computer vision
community in recent years. However, the core component of ViT, Self-Attention, lacks explicit …

Enhancing crop productivity and sustainability through disease identification in maize leaves: Exploiting a large dataset with an advanced vision transformer model

I Pacal - Expert Systems with Applications, 2024 - Elsevier
The timely identification of diseases in maize leaves offers several benefits, such as increased
crop productivity, reduced reliance on harmful chemicals, and improved production of …

SwiftFormer: Efficient additive attention for transformer-based real-time mobile vision applications

A Shaker, M Maaz, H Rasheed… - Proceedings of the …, 2023 - openaccess.thecvf.com
Self-attention has become a de facto choice for capturing global context in various vision
applications. However, its quadratic computational complexity with respect to image …

MambaVision: A hybrid Mamba-Transformer vision backbone

A Hatamizadeh, J Kautz - arXiv preprint arXiv:2407.08083, 2024 - arxiv.org
We propose a novel hybrid Mamba-Transformer backbone, denoted as MambaVision, which
is specifically tailored for vision applications. Our core contribution includes redesigning the …

NoisyQuant: Noisy bias-enhanced post-training activation quantization for vision transformers

Y Liu, H Yang, Z Dong, K Keutzer… - Proceedings of the …, 2023 - openaccess.thecvf.com
The complicated architecture and high training cost of vision transformers motivate the
exploration of post-training quantization. However, the heavy-tailed distribution of vision …

FasterViT: Fast vision transformers with hierarchical attention

A Hatamizadeh, G Heinrich, H Yin, A Tao… - arXiv preprint arXiv …, 2023 - arxiv.org
We design a new family of hybrid CNN-ViT neural networks, named FasterViT, with a focus
on high image throughput for computer vision (CV) applications. FasterViT combines the …