Accurate medical image segmentation demands the integration of multi-scale information, spanning from local features to global dependencies. However, it is challenging for existing …
U-Net has become a cornerstone in various visual applications such as image segmentation and diffusion probability models. While numerous innovative designs and improvements …
Bird's eye view (BEV) semantic segmentation plays a crucial role in spatial sensing for autonomous driving. Although recent literature has made significant progress on BEV map …
Q Fan, H Huang, M Chen, H Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Abstract Vision Transformer (ViT) has gained increasing attention in the computer vision community in recent years. However the core component of ViT Self-Attention lacks explicit …
I Pacal - Expert Systems with Applications, 2024 - Elsevier
The timely identification of diseases in maize leaf offers several benefits such as increased crop productivity, reduced reliance on harmful chemicals, and improved production of …
Self-attention has become a defacto choice for capturing global context in various vision applications. However, its quadratic computational complexity with respect to image …
We propose a novel hybrid Mamba-Transformer backbone, denoted as MambaVision, which is specifically tailored for vision applications. Our core contribution includes redesigning the …
The complicated architecture and high training cost of vision transformers urge the exploration of post-training quantization. However, the heavy-tailed distribution of vision …
We design a new family of hybrid CNN-ViT neural networks, named FasterViT, with a focus on high image throughput for computer vision (CV) applications. FasterViT combines the …