C Zeng, S Liu, S Yang, F Chen, X Mei, L Fu - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid growth in the scale and complexity of large language models (LLMs), the
costs of training and inference have risen substantially. Model compression has emerged as …