Visual tuning

BXB Yu, J Chang, H Wang, L Liu, S Wang… - ACM Computing …, 2024 - dl.acm.org
Fine-tuning visual models has been widely shown to deliver promising performance on many
downstream visual tasks. With the rapid development of pre-trained visual foundation …
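
As context for what such fine-tuning looks like in practice, here is a minimal PyTorch sketch of one common strategy surveys of this kind cover: freezing a pre-trained backbone and training only a new task head. The model choice, class count, and hyperparameters are illustrative, not taken from the paper.

```python
# Minimal sketch of adapting a pre-trained visual backbone to a downstream
# task: freeze the backbone, train a new classification head (linear probing).
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights="IMAGENET1K_V1")   # any pre-trained model
for p in backbone.parameters():
    p.requires_grad = False                           # keep features fixed
backbone.fc = nn.Linear(backbone.fc.in_features, 10)  # new 10-class head

optimizer = torch.optim.AdamW(backbone.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 3, 224, 224)      # stand-in for a downstream batch
y = torch.randint(0, 10, (8,))
loss = criterion(backbone(x), y)
loss.backward()
optimizer.step()
```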

A survey of techniques for optimizing transformer inference

KT Chitty-Venkata, S Mittal, M Emani… - Journal of Systems …, 2023 - Elsevier
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …
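
For reference, the self-attention mechanism this survey centers on can be sketched in a few lines of PyTorch. This is a single head with bare projection matrices; shapes and names are illustrative.

```python
# Minimal single-head scaled dot-product self-attention, the core
# operation behind (vision) transformers.
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x: (batch, tokens, dim); w_*: (dim, dim) projection matrices
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)  # scaled dot products
    return F.softmax(scores, dim=-1) @ v                     # attention-weighted values

x = torch.randn(2, 197, 64)            # e.g. 196 patch tokens + 1 class token
w = [torch.randn(64, 64) for _ in range(3)]
out = self_attention(x, *w)            # (2, 197, 64)
```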

MiniViT: Compressing vision transformers with weight multiplexing

J Zhang, H Peng, K Wu, M Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Abstract Vision Transformer (ViT) models have recently drawn much attention in computer
vision due to their high model capability. However, ViT models suffer from huge number of …

SPViT: Enabling faster vision transformers via latency-aware soft token pruning

Z Kong, P Dong, X Ma, X Meng, W Niu, M Sun… - European conference on …, 2022 - Springer
Recently, Vision Transformer (ViT) has continuously established new milestones in
the computer vision field, while the high computation and memory cost makes its …
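
A simplified sketch of soft token pruning: low-scoring tokens are folded into a single aggregate token rather than discarded outright. The norm-based score below is a stand-in; SPViT learns its token selection under a latency constraint.

```python
# Simplified soft token pruning: keep the highest-scoring tokens and fold
# the rest into one aggregate "package" token instead of dropping them.
import torch

def soft_prune(tokens, keep):
    # tokens: (batch, n, dim)
    scores = tokens.norm(dim=-1)                      # proxy importance per token
    idx = scores.topk(keep, dim=1).indices            # tokens to keep
    keep_mask = torch.zeros_like(scores, dtype=torch.bool).scatter(1, idx, True)
    kept = tokens[keep_mask].view(tokens.shape[0], keep, -1)
    pruned = tokens[~keep_mask].view(tokens.shape[0], -1, tokens.shape[-1])
    package = pruned.mean(dim=1, keepdim=True)        # one aggregate token
    return torch.cat([kept, package], dim=1)          # (batch, keep+1, dim)

out = soft_prune(torch.randn(2, 197, 64), keep=98)    # (2, 99, 64)
```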

One-for-all: Bridge the gap between heterogeneous architectures in knowledge distillation

Z Hao, J Guo, K Han, Y Tang, H Hu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Knowledge distillation (KD) has proven to be a highly effective approach for
enhancing model performance through a teacher-student training scheme. However, most …
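
For background, the classic logit-based KD loss that such methods build on looks roughly like this (temperature and weighting are illustrative). One-for-all's focus is making distillation of this kind work across heterogeneous teacher/student architectures, which the plain loss below does not address.

```python
# Classic teacher-student distillation loss: the student matches the
# teacher's softened output distribution plus the ground-truth labels.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                                   # rescale softened gradients
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s, t = torch.randn(8, 100), torch.randn(8, 100)
loss = kd_loss(s, t, torch.randint(0, 100, (8,)))
```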

Gold-YOLO: Efficient object detector via gather-and-distribute mechanism

C Wang, W He, Y Nie, J Guo, C Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
In the past few years, YOLO-series models have emerged as the leading approaches in the area
of real-time object detection. Many studies have pushed the baseline to a higher level by …

I-ViT: Integer-only quantization for efficient vision transformer inference

Z Li, Q Gu - Proceedings of the IEEE/CVF International …, 2023 - openaccess.thecvf.com
Vision Transformers (ViTs) have achieved state-of-the-art performance on various
computer vision applications. However, these models have considerable storage and …
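
As background, a symmetric int8 quantize/dequantize round trip is sketched below. I-ViT goes further by keeping the entire inference pipeline in integer arithmetic; this sketch only shows the basic mapping between floats and int8.

```python
# Symmetric per-tensor int8 quantization: map floats to [-128, 127] with a
# single scale, then recover an approximation by multiplying back.
import torch

def quantize_int8(x):
    scale = x.abs().max() / 127.0                   # one scale per tensor
    q = torch.clamp(torch.round(x / scale), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q, scale):
    return q.float() * scale

x = torch.randn(4, 8)
q, s = quantize_int8(x)
err = (dequantize(q, s) - x).abs().max()            # small rounding error
```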

DiffRate: Differentiable compression rate for efficient vision transformers

M Chen, W Shao, P Xu, M Lin… - Proceedings of the …, 2023 - openaccess.thecvf.com
Token compression aims to speed up large-scale vision transformers (e.g., ViTs) by pruning
(dropping) or merging tokens. It is an important but challenging task. Although recent …
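
To make the pruning/merging idea concrete, here is a toy greedy token-merging loop that averages the most similar token pair until a target count is reached. DiffRate's actual contribution, learning the per-layer compression rate differentiably, is not modeled here.

```python
# Toy token compression by merging: repeatedly average the most similar
# pair of tokens until only `target` tokens remain.
import torch
import torch.nn.functional as F

def merge_tokens(tokens, target):
    # tokens: (n, dim) for a single image
    while tokens.shape[0] > target:
        sim = F.normalize(tokens, dim=-1) @ F.normalize(tokens, dim=-1).T
        sim.fill_diagonal_(-1.0)                        # ignore self-similarity
        i, j = divmod(int(sim.argmax()), sim.shape[1])  # most similar pair
        merged = (tokens[i] + tokens[j]) / 2
        keep = [k for k in range(tokens.shape[0]) if k not in (i, j)]
        tokens = torch.cat([tokens[keep], merged[None]], dim=0)
    return tokens

out = merge_tokens(torch.randn(197, 64), target=99)     # (99, 64)
```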