A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

Structured pruning for deep convolutional neural networks: A survey

Y He, L Xiao - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …

A survey on model compression for large language models

X Zhu, J Li, Y Liu, C Ma, W Wang - Transactions of the Association for …, 2024 - direct.mit.edu
Large Language Models (LLMs) have transformed natural language processing
tasks successfully. Yet, their large size and high computational needs pose challenges for …

DepGraph: Towards any structural pruning

G Fang, X Ma, M Song, MB Mi… - Proceedings of the …, 2023 - openaccess.thecvf.com
Structural pruning enables model acceleration by removing structurally-grouped parameters
from neural networks. However, the parameter-grouping patterns vary widely across …
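
The snippet above hints at what structural pruning means in practice: whole filters are removed, and the channels coupled to them in the following layer must be removed as well, so the layer shapes genuinely shrink. The sketch below is a minimal illustration of that idea using plain L1-norm filter selection on a toy two-layer block; it is not the DepGraph dependency-grouping algorithm, and the layer sizes and 50% keep ratio are arbitrary choices.

```python
# Minimal structured (filter-level) pruning sketch on a toy two-layer block.
# NOT the DepGraph algorithm; it only shows why coupled layers are pruned
# together: dropping output filters of conv1 forces the matching input
# channels of conv2 to be dropped too.
import torch
import torch.nn as nn

conv1 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)

# Rank conv1 filters by the L1 norm of their weights and keep the top half.
l1 = conv1.weight.detach().abs().sum(dim=(1, 2, 3))          # one score per output filter
keep = torch.topk(l1, k=conv1.out_channels // 2).indices.sort().values

# Build smaller layers that carry over only the surviving weights.
pruned1 = nn.Conv2d(16, len(keep), kernel_size=3, padding=1)
pruned1.weight.data = conv1.weight.data[keep].clone()
pruned1.bias.data = conv1.bias.data[keep].clone()

pruned2 = nn.Conv2d(len(keep), 64, kernel_size=3, padding=1)
pruned2.weight.data = conv2.weight.data[:, keep].clone()      # drop the coupled input channels
pruned2.bias.data = conv2.bias.data.clone()

x = torch.randn(1, 16, 8, 8)
print(pruned2(pruned1(x)).shape)   # torch.Size([1, 64, 8, 8]), now with a thinner middle
```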

EfficientViT: Memory efficient vision transformer with cascaded group attention

X Liu, H Peng, N Zheng, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Vision transformers have shown great success due to their high model capacity.
However, their remarkable performance is accompanied by heavy computation costs, which …

A simple and effective pruning approach for large language models

M Sun, Z Liu, A Bair, JZ Kolter - arXiv preprint arXiv:2306.11695, 2023 - arxiv.org
As their size increases, Large Language Models (LLMs) are natural candidates for network
pruning methods: approaches that drop a subset of network weights while striving to …
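
The snippet describes pruning as dropping a subset of network weights. Below is a minimal one-shot magnitude-pruning sketch on a single linear layer, the classic baseline such methods build on; it is not the importance criterion proposed in this particular paper, and the layer size and 50% sparsity target are arbitrary.

```python
# One-shot unstructured magnitude pruning on a linear layer: zero out a fixed
# fraction of the smallest-magnitude weights. A generic baseline sketch, not
# this paper's scoring rule; sizes and sparsity level are illustrative.
import torch
import torch.nn as nn

layer = nn.Linear(4096, 4096)
sparsity = 0.5

with torch.no_grad():
    w = layer.weight
    k = int(sparsity * w.numel())                       # number of weights to drop
    threshold = w.abs().flatten().kthvalue(k).values    # k-th smallest magnitude
    mask = (w.abs() > threshold).float()
    w.mul_(mask)                                        # zero the pruned weights in place

print(f"achieved sparsity: {(layer.weight == 0).float().mean():.2%}")
```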

Patch diffusion: Faster and more data-efficient training of diffusion models

Z Wang, Y Jiang, H Zheng, P Wang… - Advances in neural …, 2024 - proceedings.neurips.cc
Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …

AdaLoRA: Adaptive budget allocation for parameter-efficient fine-tuning

Q Zhang, M Chen, A Bukharin… - arXiv preprint arXiv …, 2023 - arxiv.org
Fine-tuning large pre-trained language models on downstream tasks has become an
important paradigm in NLP. However, common practice fine-tunes all of the parameters in a …
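
The snippet contrasts full fine-tuning, which updates all parameters, with parameter-efficient alternatives. The sketch below shows the underlying low-rank adapter idea in its plain LoRA form: a frozen base weight plus a trainable rank-r update. AdaLoRA additionally reallocates the rank budget across weight matrices during training, which this sketch does not implement; the class name, rank, and scaling are illustrative.

```python
# Minimal LoRA-style adapter: freeze the pretrained weight W and learn a
# low-rank update B @ A on top of it. Only A and B are trained. This is a
# sketch of the general idea, not AdaLoRA's adaptive budget allocation.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # freeze the pretrained weight
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # Frozen base output plus the scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

adapted = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")    # only the low-rank factors A and B
```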

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …

Structured pruning learns compact and accurate models

M Xia, Z Zhong, D Chen - arXiv preprint arXiv:2204.00408, 2022 - arxiv.org
The growing size of neural language models has led to increased attention to model
compression. The two predominant approaches are pruning, which gradually removes …