G Hu, Z Wang, J Wei, W Huang, H Chen - arXiv preprint arXiv:2501.10054, 2025
Large language models (LLMs) demonstrate remarkable capabilities but face deployment
challenges due to their massive parameter counts. While existing compression techniques …