Learning to prune deep neural networks via layer-wise optimal brain surgeon

G Menghani - ACM Computing Surveys, 2023 - dl.acm.org

Deep learning has revolutionized the fields of computer vision, natural language
understanding, speech recognition, information retrieval, and more. However, with the …

Cited by 331 - Related articles - All 6 versions

A systematic review on overfitting control in shallow and deep neural networks

MM Bejani, M Ghatee - Artificial Intelligence Review, 2021 - Springer

Shallow neural networks process the features directly, while deep networks extract features
automatically along with the training. Both models suffer from overfitting or poor …

Cited by 305 - Related articles - All 4 versions

DepGraph: Towards any structural pruning

G Fang, X Ma, M Song, MB Mi… - Proceedings of the …, 2023 - openaccess.thecvf.com

Structural pruning enables model acceleration by removing structurally-grouped parameters
from neural networks. However, the parameter-grouping patterns vary widely across …

Cited by 252 - Related articles - All 7 versions

FlashAttention: Fast and memory-efficient exact attention with IO-awareness

T Dao, D Fu, S Ermon, A Rudra… - Advances in Neural …, 2022 - proceedings.neurips.cc

Transformers are slow and memory-hungry on long sequences, since the time and memory
complexity of self-attention are quadratic in sequence length. Approximate attention …

Cited by 1208 - Related articles - All 10 versions

Patch diffusion: Faster and more data-efficient training of diffusion models

Z Wang, Y Jiang, H Zheng, P Wang… - Advances in neural …, 2024 - proceedings.neurips.cc

Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …

Cited by 117 - Related articles - All 11 versions

Optimal brain compression: A framework for accurate post-training quantization and pruning

E Frantar, D Alistarh - Advances in Neural Information …, 2022 - proceedings.neurips.cc

We consider the problem of model compression for deep neural networks (DNNs) in the
challenging one-shot/post-training setting, in which we are given an accurate trained model …

Cited by 154 - Related articles - All 5 versions

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-Power Computer …, 2022 - taylorfrancis.com

This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …

Cited by 1136 - Related articles - All 4 versions

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org

The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …

Cited by 739 - Related articles - All 27 versions

Pruning neural networks without any data by iteratively conserving synaptic flow

H Tanaka, D Kunin, DL Yamins… - Advances in neural …, 2020 - proceedings.neurips.cc

Pruning the parameters of deep neural networks has generated intense interest due to
potential savings in time, memory and energy both during training and at test time. Recent …

Cited by 654 - Related articles - All 8 versions

BRECQ: Pushing the limit of post-training quantization by block reconstruction

Y Li, R Gong, X Tan, Y Yang, P Hu, Q Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org

We study the challenging task of neural network quantization without end-to-end retraining,
called Post-training Quantization (PTQ). PTQ usually requires a small subset of training data …

Cited by 365 - Related articles - All 4 versions
