MM Bejani, M Ghatee - Artificial Intelligence Review, 2021 - Springer
Shallow neural networks process the features directly, while deep networks extract features automatically during training. Both models suffer from overfitting or poor …
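This survey covers regularization schemes for controlling overfitting. As a minimal, generic illustration (not tied to any specific method from the review), the sketch below applies two of the most common controls, dropout and weight decay, in PyTorch; the model and hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

# Small MLP with dropout between layers; dropout and weight decay are two of
# the most common explicit regularizers for controlling overfitting.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),      # randomly zeroes activations during training
    nn.Linear(256, 10),
)

# Weight decay adds an L2 penalty on the parameters via the optimizer.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)

x = torch.randn(32, 784)             # dummy batch (placeholder data)
y = torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
```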
G Fang, X Ma, M Song, MB Mi… - Proceedings of the …, 2023 - openaccess.thecvf.com
Structural pruning enables model acceleration by removing structurally-grouped parameters from neural networks. However, the parameter-grouping patterns vary widely across …
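Structural pruning removes whole parameter groups (for example, convolutional channels), and pruning one layer's output channels forces a matching change in every layer that consumes them. The sketch below removes channels from one conv layer and the dependent input channels of the next by hand; it only illustrates this grouping/dependency problem, not the paper's algorithm.

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)

# Keep the 24 output channels of conv1 with the largest L2 norm.
norms = conv1.weight.detach().flatten(1).norm(dim=1)        # (32,)
keep = norms.argsort(descending=True)[:24]

pruned1 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
pruned1.weight.data = conv1.weight.data[keep].clone()
pruned1.bias.data = conv1.bias.data[keep].clone()

# conv2 consumes conv1's outputs, so its input channels must be pruned to match.
pruned2 = nn.Conv2d(24, 64, kernel_size=3, padding=1)
pruned2.weight.data = conv2.weight.data[:, keep].clone()
pruned2.bias.data = conv2.bias.data.clone()

x = torch.randn(1, 16, 8, 8)
out = pruned2(pruned1(x))            # shapes stay consistent after pruning
```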
Transformers are slow and memory-hungry on long sequences, since the time and memory complexity of self-attention are quadratic in sequence length. Approximate attention …
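The quadratic cost comes from materializing the full n-by-n attention matrix. A naive implementation (sketched below; this is the baseline, not the approximate or IO-aware attention the snippet refers to) makes that explicit.

```python
import math
import torch

def naive_attention(q, k, v):
    # q, k, v: (batch, seq_len, head_dim)
    # scores has shape (batch, seq_len, seq_len): time and memory grow
    # quadratically with sequence length.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 4096, 64)
out = naive_attention(q, k, v)       # allocates a 4096 x 4096 score matrix
```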
Diffusion models are powerful, but they require a lot of time and data to train. We propose Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …
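The patch-wise idea is to train on random crops rather than full images, keeping track of where each crop came from. The sketch below only shows a generic data-pipeline step, extracting a random patch and its normalized location, and is not the authors' training framework; the function and its outputs are illustrative assumptions.

```python
import torch

def random_patch(img, patch_size):
    # img: (C, H, W). Return a random crop plus its normalized location,
    # which a patch-wise model could condition on (illustrative only).
    _, h, w = img.shape
    top = torch.randint(0, h - patch_size + 1, (1,)).item()
    left = torch.randint(0, w - patch_size + 1, (1,)).item()
    patch = img[:, top:top + patch_size, left:left + patch_size]
    coords = torch.tensor([top / h, left / w, patch_size / h, patch_size / w])
    return patch, coords

img = torch.randn(3, 256, 256)
patch, coords = random_patch(img, 64)   # train on 64x64 crops instead of full images
```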
E Frantar, D Alistarh - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We consider the problem of model compression for deep neural networks (DNNs) in the challenging one-shot/post-training setting, in which we are given an accurate trained model …
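In the one-shot/post-training setting there is no retraining loop: the already-trained model is compressed in a single pass. The sketch below does the simplest possible version, global magnitude pruning of a trained model, purely to illustrate the setting; it is not the second-order method studied in the paper.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def one_shot_magnitude_prune(model, sparsity=0.5):
    # Collect all weights, find a global magnitude threshold, and zero out
    # the smallest ones in a single pass -- no retraining involved.
    all_weights = torch.cat([p.abs().flatten() for p in model.parameters()])
    threshold = torch.quantile(all_weights, sparsity)
    for p in model.parameters():
        p.mul_((p.abs() > threshold).float())

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
one_shot_magnitude_prune(model, sparsity=0.7)   # ~70% of weights set to zero
```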
This chapter provides approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods …
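As a minimal example of what quantizing the numerical values means, the sketch below maps a float tensor to 8-bit integers with a uniform affine scheme (scale and zero-point). Real schemes covered in such chapters differ in granularity, calibration, and rounding; this is only the basic recipe.

```python
import torch

def quantize_uint8(x):
    # Uniform affine quantization: map the float range [min, max] onto [0, 255].
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / 255.0
    zero_point = torch.round(-lo / scale)
    q = torch.clamp(torch.round(x / scale + zero_point), 0, 255).to(torch.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Recover an approximation of the original floats.
    return (q.float() - zero_point) * scale

w = torch.randn(64, 64)
q, scale, zp = quantize_uint8(w)
w_hat = dequantize(q, scale, zp)
max_err = (w - w_hat).abs().max()    # bounded by roughly scale / 2
```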
The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their …
Pruning the parameters of deep neural networks has generated intense interest due to potential savings in time, memory and energy both during training and at test time. Recent …
Y Li, R Gong, X Tan, Y Yang, P Hu, Q Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org
We study the challenging task of neural network quantization without end-to-end retraining, called Post-training Quantization (PTQ). PTQ usually requires a small subset of training data …
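Post-training quantization typically runs a small calibration set through the trained network to choose quantization ranges, with no end-to-end retraining. The sketch below records a per-tensor activation range over a few calibration batches and turns it into a scale and zero-point; this is a generic calibration recipe, not the block-reconstruction method of this paper, and the model and data are placeholders.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

# Record the observed activation range at the model output over a small
# calibration set (no labels, no retraining).
lo, hi = float("inf"), float("-inf")
with torch.no_grad():
    for _ in range(8):                      # 8 calibration batches
        out = model(torch.randn(32, 128))   # placeholder calibration data
        lo = min(lo, out.min().item())
        hi = max(hi, out.max().item())

# Turn the calibrated range into an 8-bit affine quantizer for that tensor.
scale = (hi - lo) / 255.0
zero_point = round(-lo / scale)
```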