A comprehensive review on deep learning based remote sensing image super-resolution methods

P Wang, B Bayram, E Sertel - Earth-Science Reviews, 2022 - Elsevier
Satellite imageries are an important geoinformation source for different applications in the
Earth Science field. However, due to the limitation of the optic and sensor technologies and …

The ninth NTIRE 2024 efficient super-resolution challenge report

B Ren, Y Li, N Mehta, R Timofte, H Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper provides a comprehensive review of the NTIRE 2024 challenge focusing on
efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this …

MLP-Mixer: An all-MLP architecture for vision

IO Tolstikhin, N Houlsby, A Kolesnikov… - Advances in neural …, 2021 - proceedings.neurips.cc
Convolutional Neural Networks (CNNs) are the go-to model for computer vision.
Recently, attention-based networks, such as the Vision Transformer, have also become …

Focal modulation networks

J Yang, C Li, X Dai, J Gao - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We propose focal modulation networks (FocalNets in short), where self-attention (SA) is
completely replaced by a focal modulation module for modeling token interactions in vision …

VOLO: Vision outlooker for visual recognition

L Yuan, Q Hou, Z Jiang, J Feng… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Recently, Vision Transformers (ViTs) have been broadly explored in visual recognition. With
low efficiency in encoding fine-level features, the performance of ViTs is still inferior to the …

Plant disease recognition model based on improved YOLOv5

Z Chen, R Wu, Y Lin, C Li, S Chen, Z Yuan, S Chen… - Agronomy, 2022 - mdpi.com
To accurately recognize plant diseases under complex natural conditions, an improved plant
disease-recognition model based on the original YOLOv5 network model was established …

MixFormer: Mixing features across windows and dimensions

Q Chen, Q Wu, J Wang, Q Hu, T Hu… - Proceedings of the …, 2022 - openaccess.thecvf.com
While local-window self-attention performs notably in vision tasks, it suffers from limited
receptive field and weak modeling capability issues. This is mainly because it performs self …

More ConvNets in the 2020s: Scaling up kernels beyond 51x51 using sparsity

S Liu, T Chen, X Chen, X Chen, Q Xiao, B Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformers have quickly shined in the computer vision world since the emergence of
Vision Transformers (ViTs). The dominant role of convolutional neural networks (CNNs) …

ACTION-Net: Multipath excitation for action recognition

Z Wang, Q She, A Smolic - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com
Spatial-temporal, channel-wise, and motion patterns are three complementary and
crucial types of information for video action recognition. Conventional 2D CNNs are …

Rachis detection and three-dimensional localization of cut off point for vision-based banana robot

F Wu, J Duan, P Ai, Z Chen, Z Yang, X Zou - Computers and Electronics in …, 2022 - Elsevier
For the operation and visual positioning of a banana robot, it is important to accurately
position the rachis and cut off point. However, the main factors that affect the three …