RMT: Retentive networks meet vision transformers

Q Fan, H Huang, M Chen, H Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Vision Transformer (ViT) has gained increasing attention in the computer vision
community in recent years. However, the core component of ViT, Self-Attention, lacks explicit …

DenseNets reloaded: paradigm shift beyond ResNets and ViTs

D Kim, B Heo, D Han - European Conference on Computer Vision, 2025 - Springer
This paper revives Densely Connected Convolutional Networks (DenseNets) and
reveals the underrated effectiveness over predominant ResNet-style architectures. We …

MambaOut: Do We Really Need Mamba for Vision?

W Yu, X Wang - arXiv preprint arXiv:2405.07992, 2024 - arxiv.org
Mamba, an architecture with RNN-like token mixer of state space model (SSM), was recently
introduced to address the quadratic complexity of the attention mechanism and …

PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution

H Chen, X Chu, Y Ren, X Zhao… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recently, some large kernel convnets strike back with appealing performance and efficiency.
However, given the square complexity of convolution, scaling up kernels can bring about an …

Mixed receptive fields augmented YOLO with multi-path spatial pyramid pooling for steel surface defect detection

K Xia, Z Lv, C Zhou, G Gu, Z Zhao, K Liu, Z Li - Sensors, 2023 - mdpi.com
Aiming at the problems of low detection efficiency and poor detection accuracy caused by
texture feature interference and dramatic changes in the scale of defect on steel surfaces, an …

Multiscale low-frequency memory network for improved feature extraction in convolutional neural networks

F Wu, J Wu, Y Kong, C Yang, G Yang, H Shu… - Proceedings of the …, 2024 - ojs.aaai.org
Deep learning and Convolutional Neural Networks (CNNs) have driven major
transformations in diverse research areas. However, their limitations in handling low …

Gramian Attention Heads are Strong yet Efficient Vision Learners

J Ryu, D Han, J Lim - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
We introduce a novel architecture design that enhances expressiveness by incorporating
multiple head classifiers (i.e., classification heads) instead of relying on channel expansion or …

Poly kernel inception network for remote sensing detection

X Cai, Q Lai, Y Wang, W Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Object detection in remote sensing images (RSIs) often suffers from several increasing
challenges including the large variation in object scales and the diverse-ranging context …

MLP-based classification of COVID-19 and skin diseases

R Zhang, L Wang, S Cheng, S Song - Expert Systems with Applications, 2023 - Elsevier
Recent years have witnessed a growing interest in neural network-based medical image
classification methods, which have demonstrated remarkable performance in this field …

YOLOFM: an improved fire and smoke object detection algorithm based on YOLOv5n

X Geng, Y Su, X Cao, H Li, L Liu - Scientific Reports, 2024 - nature.com
To address the current difficulties in fire detection algorithms, including inadequate feature
extraction, excessive computational complexity, limited deployment on devices with limited …