S Cong, Y Zhou - Artificial Intelligence Review, 2023 - Springer
The research advances concerning the typical architectures of convolutional neural networks (CNNs) as well as their optimizations are analyzed and elaborated in detail in this …
This study proposes a novel general image fusion framework based on cross-domain long-range learning and Swin Transformer, termed SwinFusion. On the one hand, an attention …
Vision transformers have shown great success due to their high model capabilities. However, their remarkable performance is accompanied by heavy computation costs, which …
Y Li, G Yuan, Y Wen, J Hu… - Advances in …, 2022 - proceedings.neurips.cc
Vision Transformers (ViTs) have shown rapid progress in computer vision tasks, achieving promising results on various benchmarks. However, due to the massive number of …
Transformers have achieved superior performance in many tasks in natural language processing and computer vision, which has also triggered great interest in the time series …
Y Li, J Hu, Y Wen, G Evangelidis… - Proceedings of the …, 2023 - openaccess.thecvf.com
With the success of Vision Transformers (ViTs) in computer vision tasks, recent works try to optimize the performance and complexity of ViTs to enable efficient deployment on mobile …
The vision transformer (ViT) has recently drawn great attention in computer vision due to its remarkable model capability. However, most prevailing ViT models suffer from a huge number …
K Wu, H Peng, M Chen, J Fu… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Relative position encoding (RPE) is important for transformers to capture the sequence ordering of input tokens. Its general efficacy has been proven in natural language processing. However …
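The snippet above describes RPE only at a high level. A minimal NumPy sketch of one common variant, a learnable relative position bias added to the attention logits, is shown below; the function and variable names are illustrative, not from the cited paper, and a real model would learn `bias_table` as a parameter.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_relative_bias(q, k, v, bias_table):
    """Scaled dot-product attention with a 1-D relative position bias.

    q, k, v: arrays of shape (seq_len, d).
    bias_table: array of shape (2 * seq_len - 1,); entry t holds the bias
    for relative distance i - j = t - (seq_len - 1). Learnable in practice.
    """
    n, d = q.shape
    logits = q @ k.T / np.sqrt(d)
    # Shift the relative distance i - j into a non-negative table index.
    idx = np.arange(n)[:, None] - np.arange(n)[None, :] + n - 1
    logits = logits + bias_table[idx]
    return softmax(logits, axis=-1) @ v
```

Because the bias depends only on the distance i - j, the same table entry is reused at every absolute position, which is what lets the encoding generalize across sequence offsets.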
The Transformer, first applied to the field of natural language processing, is a type of deep neural network based mainly on the self-attention mechanism. Thanks to its strong representation …
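For readers unfamiliar with the self-attention mechanism the snippet refers to, here is a minimal single-head sketch in NumPy; the weight-matrix names are illustrative assumptions, and production implementations add multiple heads, masking, and learned parameters.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    x: (seq_len, d_model) input token embeddings.
    Wq, Wk, Wv: (d_model, d) projection matrices (learned in practice).
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])     # pairwise scaled dot products
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)   # softmax over the key axis
    return weights @ v                          # weighted sum of values
```

Each output row is a convex combination of all value vectors, which is the "strong representation" property the snippet alludes to: every token can attend to every other token in one step.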