A survey of techniques for optimizing transformer inference

KT Chitty-Venkata, S Mittal, M Emani… - Journal of Systems …, 2023 - Elsevier
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …

A survey on transformer compression

Y Tang, Y Wang, J Guo, Z Tu, K Han, H Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large models based on the Transformer architecture play increasingly vital roles in artificial
intelligence, particularly within the realms of natural language processing (NLP) and …

PTQD: Accurate post-training quantization for diffusion models

Y He, L Liu, J Liu, W Wu, H Zhou… - Advances in Neural …, 2024 - proceedings.neurips.cc
Diffusion models have recently dominated image synthesis and other related generative
tasks. However, the iterative denoising process is computationally expensive at inference …

Which tokens to use? Investigating token reduction in vision transformers

JB Haurum, S Escalera, GW Taylor… - Proceedings of the …, 2023 - openaccess.thecvf.com
Since the introduction of the Vision Transformer (ViT), researchers have sought to make ViTs
more efficient by removing redundant information in the processed tokens. While different …
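
As a concrete illustration of the token-reduction idea this snippet describes, the sketch below prunes patch tokens by their attention weight from the [CLS] token. This is a minimal example of one common strategy, not any specific method compared in the paper; the function name and the keep_ratio parameter are illustrative.

    import numpy as np

    def prune_tokens(tokens, cls_attn, keep_ratio=0.5):
        """Keep the top-k patch tokens ranked by their attention weight
        from the [CLS] token; drop the rest (illustrative strategy).

        tokens:   (N, D) array, row 0 is the [CLS] token
        cls_attn: (N,) attention weights from [CLS] to every token
        """
        n_patch = tokens.shape[0] - 1
        k = max(1, int(n_patch * keep_ratio))
        # Rank patch tokens (indices 1..N-1) by [CLS] attention, keep top-k
        order = np.argsort(cls_attn[1:])[::-1][:k] + 1
        keep = np.concatenate(([0], np.sort(order)))  # always keep [CLS]
        return tokens[keep]

    # toy example: 197 tokens (1 [CLS] + 196 patches), dim 64
    rng = np.random.default_rng(0)
    toks = rng.normal(size=(197, 64))
    attn = rng.random(197)
    print(prune_tokens(toks, attn).shape)  # (99, 64): [CLS] + 98 kept patches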

RepQ-ViT: Scale reparameterization for post-training quantization of vision transformers

Z Li, J Xiao, L Yang, Q Gu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Post-training quantization (PTQ), which requires only a tiny dataset for calibration
without end-to-end retraining, is a lightweight and practical model compression technique …
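
The snippet's definition of PTQ, a tiny calibration set and no end-to-end retraining, can be made concrete with a minimal min-max calibration sketch. This illustrates generic per-tensor PTQ, not RepQ-ViT's scale reparameterization; all names here are illustrative.

    import numpy as np

    def calibrate_scale(calib_batches, n_bits=8):
        """Derive a per-tensor quantization scale from a small calibration
        set by tracking the max absolute activation value (min-max PTQ)."""
        max_abs = max(np.abs(x).max() for x in calib_batches)
        qmax = 2 ** (n_bits - 1) - 1          # e.g. 127 for int8
        return max_abs / qmax

    def quantize(x, scale, n_bits=8):
        qmax = 2 ** (n_bits - 1) - 1
        q = np.clip(np.round(x / scale), -qmax - 1, qmax)
        return q.astype(np.int8)              # dequantize via q * scale

    rng = np.random.default_rng(0)
    calib = [rng.normal(size=(32, 768)) for _ in range(8)]  # tiny calibration set
    s = calibrate_scale(calib)
    q = quantize(rng.normal(size=(32, 768)), s)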

I-ViT: Integer-only quantization for efficient vision transformer inference

Z Li, Q Gu - Proceedings of the IEEE/CVF International …, 2023 - openaccess.thecvf.com
Vision Transformers (ViTs) have achieved state-of-the-art performance on various
computer vision applications. However, these models have considerable storage and …
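
Integer-only inference, as in the title above, means every runtime operation uses integer arithmetic, with floating-point scales folded into a fixed-point multiplier. The sketch below shows this for a single linear layer; it is a generic requantization scheme, not I-ViT's specific shift-based operators.

    import numpy as np

    def int_linear(q_x, q_w, s_x, s_w, s_out, n_bits=8):
        """Integer-only linear layer: int8 x int8 -> int32 accumulate,
        then requantize to int8 with a fixed-point multiplier, so no
        float operations are needed at inference time."""
        acc = q_x.astype(np.int32) @ q_w.astype(np.int32)
        # Fold the three scales into one fixed-point multiplier m * 2^-shift
        shift = 31
        m = int(round(s_x * s_w / s_out * (1 << shift)))
        qmax = 2 ** (n_bits - 1) - 1
        out = (acc.astype(np.int64) * m) >> shift   # fixed-point rescale
        return np.clip(out, -qmax - 1, qmax).astype(np.int8)

    rng = np.random.default_rng(0)
    q_x = rng.integers(-128, 128, size=(4, 64), dtype=np.int8)
    q_w = rng.integers(-128, 128, size=(64, 64), dtype=np.int8)
    y = int_linear(q_x, q_w, s_x=0.02, s_w=0.01, s_out=0.1)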

NoisyQuant: Noisy bias-enhanced post-training activation quantization for vision transformers

Y Liu, H Yang, Z Dong, K Keutzer… - Proceedings of the …, 2023 - openaccess.thecvf.com
The complicated architecture and high training cost of vision transformers motivate the
exploration of post-training quantization. However, the heavy-tailed distribution of vision …
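
The title describes the core mechanism: a fixed, pre-sampled noisy bias is added to activations before quantization and removed afterwards, reshaping the rounding error on heavy-tailed inputs. The sketch below is a loose paraphrase of that idea under an assumed per-channel uniform bias, not the paper's exact formulation.

    import numpy as np

    def noisy_quant(x, scale, noise, n_bits=8):
        """Add a fixed, pre-sampled noisy bias before quantization and
        subtract it after dequantization; since the bias is known, the
        subtraction can be folded into the next layer at deployment."""
        qmax = 2 ** (n_bits - 1) - 1
        q = np.clip(np.round((x + noise) / scale), -qmax - 1, qmax)
        return q * scale - noise  # dequantized output with bias removed

    rng = np.random.default_rng(0)
    x = rng.standard_t(df=3, size=(32, 768))              # heavy-tailed activations
    scale = np.abs(x).max() / 127
    noise = rng.uniform(-scale / 2, scale / 2, size=768)  # fixed per-channel bias
    y = noisy_quant(x, scale, noise)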

Stitchable neural networks

Z Pan, J Cai, B Zhuang - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
The public model zoo of powerful pretrained model families (e.g., ResNet/DeiT) has
reached an unprecedented scope, which significantly …

Jumping through local minima: Quantization in the loss landscape of vision transformers

N Frumkin, D Gope… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Quantization scale and bit-width are the most important parameters when considering how
to quantize a neural network. Prior work focuses on optimizing quantization scales in a …
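
The snippet names the two knobs precisely: bit-width fixes the integer grid, and scale maps real values onto it. A minimal uniform symmetric quantizer makes the role of each explicit; this illustrates the parameters being optimized, not the paper's search procedure itself.

    import numpy as np

    def uniform_quantize(x, scale, n_bits):
        """Uniform symmetric quantizer: bit-width sets the integer range,
        scale maps real values onto it; reconstruction error depends on
        the interplay of both parameters."""
        qmax = 2 ** (n_bits - 1) - 1
        q = np.clip(np.round(x / scale), -qmax - 1, qmax)
        return q * scale  # dequantized value

    x = np.linspace(-1, 1, 9)
    for bits in (4, 8):
        s = np.abs(x).max() / (2 ** (bits - 1) - 1)
        print(bits, np.abs(uniform_quantize(x, s, bits) - x).max())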

PackQViT: Faster sub-8-bit vision transformers via full and packed quantization on the mobile

P Dong, L Lu, C Wu, C Lyu, G Yuan… - Advances in Neural …, 2024 - proceedings.neurips.cc
While Vision Transformers (ViTs) have undoubtedly made impressive strides in
computer vision (CV), their intricate network structures necessitate substantial computation …
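
Sub-8-bit values do not align to byte boundaries, so packed quantization stores several of them per byte. The sketch below packs pairs of signed 4-bit values into single bytes and recovers them losslessly; it illustrates the packing idea only, not the paper's mobile kernels.

    import numpy as np

    def pack_int4(q):
        """Pack pairs of signed 4-bit values into single bytes:
        even indices in the low nibble, odd indices in the high nibble."""
        u = (q.astype(np.uint8) & 0x0F).reshape(-1, 2)  # two's-complement nibbles
        return (u[:, 0] | (u[:, 1] << 4)).astype(np.uint8)

    def unpack_int4(packed):
        lo = (packed & 0x0F).astype(np.int8)
        hi = ((packed >> 4) & 0x0F).astype(np.int8)
        # sign-extend the 4-bit two's-complement nibbles back to int8
        return np.stack([np.where(lo > 7, lo - 16, lo),
                         np.where(hi > 7, hi - 16, hi)], axis=1).reshape(-1)

    q = np.array([-8, -1, 0, 3, 7, -4], dtype=np.int8)
    assert (unpack_int4(pack_int4(q)) == q).all()  # lossless round trip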