A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

PERP: Rethinking the prune-retrain paradigm in the era of LLMs

M Zimmer, M Andoni, C Spiegel, S Pokutta - arXiv preprint arXiv …, 2023 - arxiv.org
Neural Networks can be efficiently compressed through pruning, significantly reducing
storage and computational demands while maintaining predictive performance. Simple yet …