Attention mechanisms in computer vision: A survey

MH Guo, TX Xu, JJ Liu, ZN Liu, PT Jiang, TJ Mu… - Computational visual …, 2022 - Springer
Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …

Review of image classification algorithms based on convolutional neural networks

L Chen, S Li, Q Bai, J Yang, S Jiang, Y Miao - Remote Sensing, 2021 - mdpi.com
Image classification has always been a hot research direction in the world, and the
emergence of deep learning has promoted the development of this field. Convolutional …

ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders

S Woo, S Debnath, R Hu, X Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Driven by improved architectures and better representation learning frameworks, the field of
visual recognition has enjoyed rapid modernization and performance boost in the early …

SegNeXt: Rethinking convolutional attention design for semantic segmentation

MH Guo, CZ Lu, Q Hou, Z Liu… - Advances in Neural …, 2022 - proceedings.neurips.cc
We present SegNeXt, a simple convolutional network architecture for semantic
segmentation. Recent transformer-based models have dominated the field of semantic …

Visual attention network

MH Guo, CZ Lu, ZN Liu, MM Cheng, SM Hu - Computational Visual Media, 2023 - Springer
While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …

Efficient multi-scale attention module with cross-spatial learning

D Ouyang, S He, G Zhang, M Luo… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Remarkable effectiveness of the channel or spatial attention mechanisms for producing
more discernible feature representation are illustrated in various computer vision tasks …

DaViT: Dual attention vision transformers

M Ding, B Xiao, N Codella, P Luo, J Wang… - European conference on …, 2022 - Springer
In this work, we introduce Dual Attention Vision Transformers (DaViT), a simple yet effective
vision transformer architecture that is able to capture global context while maintaining …

Camouflaged object detection with feature decomposition and edge reconstruction

C He, K Li, Y Zhang, L Tang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Camouflaged object detection (COD) aims to address the tough issue of identifying
camouflaged objects visually blended into the surrounding backgrounds. COD is a …

UCTransNet: Rethinking the skip connections in U-Net from a channel-wise perspective with transformer

H Wang, P Cao, J Wang, OR Zaiane - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Most recent semantic segmentation methods adopt a U-Net framework with an encoder-decoder
architecture. It is still challenging for U-Net with a simple skip connection scheme to …

A lightweight vehicles detection network model based on YOLOv5

X Dong, S Yan, C Duan - Engineering Applications of Artificial Intelligence, 2022 - Elsevier
Vehicle detection technology is of great significance for realizing automatic monitoring and
AI-assisted driving systems. The state-of-the-art object detection method, namely, a class of …