Transformers in vision: A survey

S Khan, M Naseer, M Hayat, SW Zamir… - ACM computing …, 2022 - dl.acm.org
Astounding results from Transformer models on natural language tasks have intrigued the
vision community to study their application to computer vision problems. Among their salient …

A review of convolutional neural network architectures and their optimizations

S Cong, Y Zhou - Artificial Intelligence Review, 2023 - Springer
The research advances concerning the typical architectures of convolutional neural
networks (CNNs) as well as their optimizations are analyzed and elaborated in detail in this …

SwinFusion: Cross-domain long-range learning for general image fusion via swin transformer

J Ma, L Tang, F Fan, J Huang, X Mei… - IEEE/CAA Journal of …, 2022 - ieeexplore.ieee.org
This study proposes a novel general image fusion framework based on cross-domain
long-range learning and Swin Transformer, termed SwinFusion. On the one hand, an attention …

EfficientViT: Memory efficient vision transformer with cascaded group attention

X Liu, H Peng, N Zheng, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Vision transformers have shown great success due to their high model capabilities.
However, their remarkable performance is accompanied by heavy computation costs, which …

EfficientFormer: Vision transformers at MobileNet speed

Y Li, G Yuan, Y Wen, J Hu… - Advances in …, 2022 - proceedings.neurips.cc
Vision Transformers (ViT) have shown rapid progress in computer vision tasks,
achieving promising results on various benchmarks. However, due to the massive number of …

Transformers in time series: A survey

Q Wen, T Zhou, C Zhang, W Chen, Z Ma, J Yan… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformers have achieved superior performance in many natural language
processing and computer vision tasks, which has also triggered great interest in the time series …

Rethinking vision transformers for MobileNet size and speed

Y Li, J Hu, Y Wen, G Evangelidis… - Proceedings of the …, 2023 - openaccess.thecvf.com
With the success of Vision Transformers (ViTs) in computer vision tasks, recent works try to
optimize the performance and complexity of ViTs to enable efficient deployment on mobile …

TinyViT: Fast pretraining distillation for small vision transformers

K Wu, J Zhang, H Peng, M Liu, B Xiao, J Fu… - European conference on …, 2022 - Springer
The vision transformer (ViT) has recently drawn great attention in computer vision due to its
remarkable model capability. However, most prevailing ViT models suffer from a huge number …

Rethinking and improving relative position encoding for vision transformer

K Wu, H Peng, M Chen, J Fu… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Relative position encoding (RPE) is important for transformers to capture the sequence ordering
of input tokens. Its general efficacy has been proven in natural language processing. However …

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …