H Lin, H Xu, Y Wu, J Cui, Y Zhang, L Mou… - arXiv e …, 2024 - ui.adsabs.harvard.edu
Quantizing large language models (LLMs) presents significant challenges, primarily due to
outlier activations that compromise the efficiency of low-bit representation. Traditional …