A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - arXiv preprint arXiv:2308.06767, 2023 - arxiv.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

SPViT: Enabling faster vision transformers via latency-aware soft token pruning

Z Kong, P Dong, X Ma, X Meng, W Niu, M Sun… - European conference on …, 2022 - Springer
Recently, Vision Transformer (ViT) has continuously established new milestones in
the computer vision field, while the high computation and memory cost makes its …

Advancing model pruning via bi-level optimization

Y Zhang, Y Yao, P Ram, P Zhao… - Advances in …, 2022 - proceedings.neurips.cc
The deployment constraints in practical applications necessitate the pruning of large-scale
deep learning models, i.e., promoting their weight sparsity. As illustrated by the Lottery Ticket …

Compute-efficient deep learning: Algorithmic trends and opportunities

BR Bartoldson, B Kailkhura, D Blalock - Journal of Machine Learning …, 2023 - jmlr.org
Although deep learning has made great progress in recent years, the exploding economic
and environmental costs of training neural networks are becoming unsustainable. To …

Trainability preserving neural pruning

H Wang, Y Fu - arXiv preprint arXiv:2207.12534, 2022 - arxiv.org
Many recent works have shown that trainability plays a central role in neural network pruning:
broken trainability, if left unattended, can lead to severe under-performance and unintentionally …

Exposing and exploiting fine-grained block structures for fast and accurate sparse training

P Jiang, L Hu, S Song - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Sparse training is a popular technique to reduce the overhead of training large models.
Although previous work has shown promising results for nonstructured sparse models, it is …

Towards sparsification of graph neural networks

H Peng, D Gurevin, S Huang, T Geng… - 2022 IEEE 40th …, 2022 - ieeexplore.ieee.org
As real-world graphs expand in size, larger GNN models with billions of parameters are
deployed. The high parameter count of such models makes training and inference on graphs …

Dynamic sparsity is channel-level sparsity learner

L Yin, G Li, M Fang, L Shen, T Huang… - Advances in …, 2024 - proceedings.neurips.cc
Sparse training has received surging interest in machine learning due to its tantalizing
potential for savings in both the entire training process and inference. Dynamic sparse …

Audio lottery: Speech recognition made ultra-lightweight, noise-robust, and transferable

S Ding, T Chen, Z Wang - International Conference on Learning …, 2022 - par.nsf.gov
Lightweight speech recognition models have seen explosive demand owing to the growing
number of speech-interactive features on mobile devices. Since designing such systems …

Sparsity winning twice: Better robust generalization from more efficient training

T Chen, Z Zhang, P Wang, S Balachandra… - arXiv preprint arXiv …, 2022 - arxiv.org
Recent studies demonstrate that deep networks, even robustified by the state-of-the-art
adversarial training (AT), still suffer from large robust generalization gaps, in addition to the …