Rotation and Permutation for Advanced Outlier Management and Efficient Quantization of LLMs

H Lin, H Xu, Y Wu, J Cui, Y Zhang, L Mou… - arXiv preprint arXiv …, 2024 - arxiv.org
Quantizing large language models (LLMs) presents significant challenges, primarily due to
outlier activations that compromise the efficiency of low-bit representation. Traditional …
