Involution: Inverting the inherence of convolution for visual recognition

P Wang, B Bayram, E Sertel - Earth-Science Reviews, 2022 - Elsevier

Satellite imageries are an important geoinformation source for different applications in the
Earth Science field. However, due to the limitation of the optic and sensor technologies and …

Cited by 168 · Related articles · All 5 versions

[PDF] thecvf.com

The ninth NTIRE 2024 efficient super-resolution challenge report

B Ren, Y Li, N Mehta, R Timofte, H Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com

This paper provides a comprehensive review of the NTIRE 2024 challenge focusing on
efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this …

Cited by 20 · Related articles · All 3 versions

[PDF] neurips.cc

MLP-Mixer: An all-MLP architecture for vision

IO Tolstikhin, N Houlsby, A Kolesnikov… - Advances in neural …, 2021 - proceedings.neurips.cc

Convolutional Neural Networks (CNNs) are the go-to model for computer vision.
Recently, attention-based networks, such as the Vision Transformer, have also become …

Cited by 2552 · Related articles · All 15 versions

[PDF] neurips.cc

Focal modulation networks

J Yang, C Li, X Dai, J Gao - Advances in Neural Information …, 2022 - proceedings.neurips.cc

We propose focal modulation networks (FocalNets in short), where self-attention (SA) is
completely replaced by a focal modulation module for modeling token interactions in vision …

Cited by 184 · Related articles · All 6 versions

[PDF] arxiv.org

VOLO: Vision outlooker for visual recognition

L Yuan, Q Hou, Z Jiang, J Feng… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Recently, Vision Transformers (ViTs) have been broadly explored in visual recognition. With
low efficiency in encoding fine-level features, the performance of ViTs is still inferior to the …

Cited by 303 · Related articles · All 7 versions

[PDF] mdpi.com

Plant disease recognition model based on improved YOLOv5

Z Chen, R Wu, Y Lin, C Li, S Chen, Z Yuan, S Chen… - Agronomy, 2022 - mdpi.com

To accurately recognize plant diseases under complex natural conditions, an improved plant
disease-recognition model based on the original YOLOv5 network model was established …

Cited by 199 · Related articles · All 7 versions

[PDF] thecvf.com

MixFormer: Mixing features across windows and dimensions

Q Chen, Q Wu, J Wang, Q Hu, T Hu… - Proceedings of the …, 2022 - openaccess.thecvf.com

While local-window self-attention performs notably in vision tasks, it suffers from limited
receptive field and weak modeling capability issues. This is mainly because it performs self …

Cited by 118 · Related articles · All 6 versions

[PDF] arxiv.org

More ConvNets in the 2020s: Scaling up kernels beyond 51x51 using sparsity

S Liu, T Chen, X Chen, X Chen, Q Xiao, B Wu… - arXiv preprint arXiv …, 2022 - arxiv.org

Transformers have quickly shined in the computer vision world since the emergence of
Vision Transformers (ViTs). The dominant role of convolutional neural networks (CNNs) …

Cited by 146 · Related articles · All 12 versions

[PDF] thecvf.com

ACTION-Net: Multipath excitation for action recognition

Z Wang, Q She, A Smolic - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com

Spatial-temporal, channel-wise, and motion patterns are three complementary and
crucial types of information for video action recognition. Conventional 2D CNNs are …

Cited by 211 · Related articles · All 9 versions

Rachis detection and three-dimensional localization of cut off point for vision-based banana robot

F Wu, J Duan, P Ai, Z Chen, Z Yang, X Zou - Computers and Electronics in …, 2022 - Elsevier

For the operation and visual positioning of a banana robot, it is important to accurately
position the rachis and cut off point. However, the main factors that affect the three …

Cited by 79 · Related articles · All 5 versions
