Large models based on the Transformer architecture play increasingly vital roles in artificial intelligence, particularly within the realms of natural language processing (NLP) and …
Y He, L Liu, J Liu, W Wu, H Zhou… - Advances in Neural Information Processing Systems, 2024 - proceedings.neurips.cc
Diffusion models have recently dominated image synthesis and other related generative tasks. However, the iterative denoising process is computationally expensive at inference …
Since the introduction of the Vision Transformer (ViT), researchers have sought to make ViTs more efficient by removing redundant information in the processed tokens. While different …
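A minimal sketch of the general idea behind such token-reduction methods (not any single cited paper's algorithm): score each patch token, here hypothetically by the attention it receives from the [CLS] token, and keep only the top fraction between blocks. The helper `prune_tokens` and the `keep_ratio` parameter are illustrative names introduced for this sketch.

```python
# Sketch of score-based token pruning, the general idea behind many ViT
# token-reduction methods. Names and the [CLS]-attention criterion are
# assumptions for illustration, not a specific paper's method.
import numpy as np

def prune_tokens(tokens: np.ndarray, cls_attn: np.ndarray, keep_ratio: float = 0.7):
    """tokens: (N, D) patch tokens; cls_attn: (N,) attention weight each token
    receives from the [CLS] token. Keeps the most-attended keep_ratio fraction."""
    n_keep = max(1, int(tokens.shape[0] * keep_ratio))
    keep_idx = np.argsort(cls_attn)[::-1][:n_keep]  # highest-attention tokens
    return tokens[np.sort(keep_idx)]                # preserve spatial order

# Toy usage: 196 patch tokens of width 768 with random attention scores.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(196, 768))
cls_attn = rng.random(196)
print(prune_tokens(tokens, cls_attn).shape)  # (137, 768) at keep_ratio=0.7
```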
Z Li, J Xiao, L Yang, Q Gu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique …
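To make the PTQ recipe described in this snippet concrete, here is a hedged sketch: fit a quantization scale from a tiny calibration set, then fake-quantize with no retraining. Plain symmetric min-max calibration is used purely for illustration; `calibrate_scale` and `quantize` are hypothetical helpers, and real PTQ methods use more elaborate calibration objectives.

```python
# Sketch of the basic PTQ workflow: calibrate on a handful of batches,
# then quantize. Min-max calibration is an illustrative assumption.
import numpy as np

def calibrate_scale(calib_batches, n_bits: int = 8):
    """Fit a symmetric per-tensor scale from a tiny calibration set."""
    max_abs = max(np.abs(b).max() for b in calib_batches)
    return max_abs / (2 ** (n_bits - 1) - 1)

def quantize(x, scale, n_bits: int = 8):
    qmax = 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale  # dequantized ("fake-quantized") values

rng = np.random.default_rng(0)
calib = [rng.normal(size=(32, 768)) for _ in range(4)]  # tiny calibration set
s = calibrate_scale(calib)
x = rng.normal(size=(32, 768))
err = np.mean((x - quantize(x, s)) ** 2)
print(f"scale={s:.4f}, mse={err:.2e}")
```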
Z Li, Q Gu - Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023 - openaccess.thecvf.com
Vision Transformers (ViTs) have achieved state-of-the-art performance on various computer vision applications. However, these models have considerable storage and …
The complicated architecture and high training cost of vision transformers motivate the exploration of post-training quantization. However, the heavy-tailed distribution of vision …
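A small numerical illustration of why heavy tails hurt quantization: under min-max calibration, a few outliers inflate the scale and crush the bulk of values into a handful of bins. The percentile clipping shown is a generic remedy assumed for illustration, not the cited paper's technique.

```python
# Heavy-tailed activations vs. min-max calibration: outliers inflate the
# clipping range; percentile clipping (an illustrative remedy) trades a
# little outlier error for much finer resolution on the bulk.
import numpy as np

def fake_quant(x, clip_val, n_bits=4):
    qmax = 2 ** (n_bits - 1) - 1
    scale = clip_val / qmax
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

rng = np.random.default_rng(0)
acts = rng.standard_t(df=2, size=100_000)  # heavy-tailed proxy for ViT activations
for name, clip_val in [("min-max", np.abs(acts).max()),
                       ("99.9th percentile", np.percentile(np.abs(acts), 99.9))]:
    mse = np.mean((acts - fake_quant(acts, clip_val)) ** 2)
    print(f"{name:>18}: clip={clip_val:8.2f}  mse={mse:.4f}")
```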
Z Pan, J Cai, B Zhuang - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
The public model zoo, containing enormously powerful pretrained model families (e.g., ResNet/DeiT), has reached an unprecedented scope, which significantly …
N Frumkin, D Gope… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Quantization scale and bit-width are the most important parameters when considering how to quantize a neural network. Prior work focuses on optimizing quantization scales in a …
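A worked sketch of these two knobs: a uniform quantizer is fully determined by its scale and bit-width, and reconstruction error trades off between rounding error (scale too large) and clipping error (scale too small). The helper below is illustrative only, not the cited paper's optimizer.

```python
# Uniform quantizer parameterized by scale s and bit-width b:
#   q(x) = clip(round(x / s), -2^(b-1), 2^(b-1) - 1) * s
import numpy as np

def uniform_quantize(x, scale, bits):
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), qmin, qmax) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=10_000)
for bits in (8, 4):
    for scale in (0.01, 0.05, 0.2):
        mse = np.mean((w - uniform_quantize(w, scale, bits)) ** 2)
        print(f"bits={bits} scale={scale:.2f} -> mse={mse:.5f}")
```

Sweeping the grid shows the interaction: at 8 bits a small scale is safe, while at 4 bits the same scale clips aggressively, which is why scale and bit-width must be chosen jointly.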
P Dong, L Lu, C Wu, C Lyu, G Yuan… - Advances in Neural Information Processing Systems, 2024 - proceedings.neurips.cc
While Vision Transformers (ViTs) have undoubtedly made impressive strides in computer vision (CV), their intricate network structures necessitate substantial computation …