Structured pruning for deep convolutional neural networks: A survey

Y He, L Xiao - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …
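The entry above concerns structured pruning, i.e., removing whole filters or channels rather than individual weights. Below is a minimal sketch of one common criterion from this literature (ranking convolutional filters by L1 norm and zeroing the weakest ones); the function and names are illustrative, not a method taken from the survey.

```python
import torch
import torch.nn as nn

def prune_filters_by_l1(conv: nn.Conv2d, prune_ratio: float = 0.3) -> None:
    """Zero out the `prune_ratio` fraction of output filters with the smallest L1 norm."""
    with torch.no_grad():
        # L1 norm per output filter: shape (out_channels,)
        scores = conv.weight.abs().sum(dim=(1, 2, 3))
        n_prune = int(prune_ratio * scores.numel())
        if n_prune == 0:
            return
        weakest = torch.argsort(scores)[:n_prune]   # indices of lowest-norm filters
        conv.weight[weakest] = 0.0
        if conv.bias is not None:
            conv.bias[weakest] = 0.0

# Example: prune 30% of the filters of a single convolution layer.
conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
prune_filters_by_l1(conv, prune_ratio=0.3)
```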

AutoML for deep recommender systems: A survey

R Zheng, L Qu, B Cui, Y Shi, H Yin - ACM Transactions on Information …, 2023 - dl.acm.org
Recommender systems play a significant role in information filtering and have been utilized
in different scenarios, such as e-commerce and social media. With the prosperity of deep …

A simple and effective pruning approach for large language models

M Sun, Z Liu, A Bair, JZ Kolter - arXiv preprint arXiv:2306.11695, 2023 - arxiv.org
As their size increases, Large Language Models (LLMs) are natural candidates for network
pruning methods: approaches that drop a subset of network weights while striving to …
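The "drop a subset of network weights" that the snippet describes can be illustrated with plain unstructured magnitude pruning. This is a generic baseline sketch, not the paper's specific pruning criterion; the function name and sparsity level are assumptions.

```python
import torch
import torch.nn as nn

def magnitude_prune_(linear: nn.Linear, sparsity: float = 0.5) -> torch.Tensor:
    """Zero the `sparsity` fraction of smallest-magnitude weights; return the binary mask."""
    with torch.no_grad():
        w = linear.weight
        k = int(sparsity * w.numel())
        if k == 0:
            return torch.ones_like(w)
        # The k-th smallest absolute value serves as the pruning threshold.
        threshold = torch.kthvalue(w.abs().flatten(), k).values
        mask = (w.abs() > threshold).float()
        w.mul_(mask)
        return mask

layer = nn.Linear(1024, 1024)
mask = magnitude_prune_(layer, sparsity=0.5)
print(f"fraction of weights kept: {mask.mean().item():.2%}")
```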

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …

Rigging the lottery: Making all tickets winners

U Evci, T Gale, J Menick, PS Castro… - … on Machine Learning, 2020 - proceedings.mlr.press
Many applications require sparse neural networks due to space or inference time
restrictions. There is a large body of work on training dense networks to yield sparse …
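A minimal sketch of the kind of dynamic sparse training update associated with this line of work: periodically drop the lowest-magnitude active weights and regrow the same number of inactive ones where the dense gradient is largest. The function name, update fraction, and schedule are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def drop_and_grow(weight: torch.Tensor, grad: torch.Tensor,
                  mask: torch.Tensor, update_frac: float = 0.3) -> torch.Tensor:
    """Return a new binary mask with the same number of active weights."""
    n_update = int(update_frac * mask.sum().item())
    if n_update == 0:
        return mask

    # Drop: among currently active weights, remove the smallest magnitudes.
    active = torch.where(mask.bool(), weight.abs(),
                         torch.full_like(weight, float("inf")))
    drop_idx = torch.topk(active.flatten(), n_update, largest=False).indices

    # Grow: among inactive weights, activate where the gradient magnitude is largest.
    inactive = torch.where(mask.bool(), torch.full_like(grad, float("-inf")),
                           grad.abs())
    grow_idx = torch.topk(inactive.flatten(), n_update, largest=True).indices

    new_mask = mask.clone().flatten()
    new_mask[drop_idx] = 0.0
    new_mask[grow_idx] = 1.0
    return new_mask.reshape(mask.shape)

# Example with a random layer: keep 10% of weights, update 30% of them per step.
w = torch.randn(256, 256)
g = torch.randn(256, 256)       # stand-in for a gradient of the loss w.r.t. w
m = (torch.rand(256, 256) < 0.1).float()
m = drop_and_grow(w, g, m, update_frac=0.3)
```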

Sparse invariant risk minimization

X Zhou, Y Lin, W Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
Invariant Risk Minimization (IRM) is an emerging invariant feature extraction
technique that helps generalization under distribution shift. However, we find that there exists a …
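For background on the objective the entry names, here is a sketch of the IRMv1 formulation commonly used to instantiate IRM, taken from the broader IRM literature rather than from this paper's sparse variant; $R^{e}$ is the risk on training environment $e$, $\Phi$ the learned predictor, and $w = 1.0$ a fixed scalar "dummy" classifier.

```latex
% IRMv1 objective (standard formulation from the IRM literature, shown only as
% background; the sparse variant studied in the paper above is not reproduced here).
\[
  \min_{\Phi}\;\sum_{e \in \mathcal{E}_{\mathrm{tr}}}
  \Bigl[\, R^{e}(\Phi)
  \;+\; \lambda\,\bigl\lVert \nabla_{w \mid w=1.0}\, R^{e}(w \cdot \Phi) \bigr\rVert^{2} \Bigr]
\]
```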

Train big, then compress: Rethinking model size for efficient training and inference of transformers

Z Li, E Wallace, S Shen, K Lin… - International …, 2020 - proceedings.mlr.press
Since hardware resources are limited, the objective of training deep learning models is
typically to maximize accuracy subject to the time and memory constraints of training and …

Sparse training via boosting pruning plasticity with neuroregeneration

S Liu, T Chen, X Chen, Z Atashgahi… - Advances in …, 2021 - proceedings.neurips.cc
Work on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) has recently
drawn considerable attention to post-training pruning (iterative magnitude pruning) and before …
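For reference, a sketch of the iterative magnitude pruning (IMP) loop the snippet mentions: train, prune a fraction of the smallest surviving weights, and keep training. The `train_one_round` hook, round count, and pruning fraction are placeholders, not this paper's procedure.

```python
import torch
import torch.nn as nn

def iterative_magnitude_pruning(model: nn.Module, train_one_round,
                                rounds: int = 5, prune_frac: float = 0.2):
    """Alternate training and pruning; `train_one_round(model, masks)` is a placeholder hook."""
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters() if p.dim() > 1}
    for _ in range(rounds):
        train_one_round(model, masks)               # train with the current masks applied
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name not in masks:
                    continue
                alive = p[masks[name].bool()].abs() # magnitudes of surviving weights
                k = int(prune_frac * alive.numel())
                if k == 0:
                    continue
                thresh = torch.kthvalue(alive, k).values
                masks[name] *= (p.abs() > thresh).float()
                p.mul_(masks[name])                 # zero out the newly pruned weights
    return masks
```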

Do we actually need dense over-parameterization? In-time over-parameterization in sparse training

S Liu, L Yin, DC Mocanu… - … on Machine Learning, 2021 - proceedings.mlr.press
In this paper, we introduce a new perspective on training deep neural networks capable of
state-of-the-art performance without the need for the expensive over-parameterization by …

Layer-adaptive sparsity for the magnitude-based pruning

J Lee, S Park, S Mo, S Ahn, J Shin - arXiv preprint arXiv:2010.07611, 2020 - arxiv.org
Recent discoveries on neural network pruning reveal that, with a carefully chosen layerwise
sparsity, simple magnitude-based pruning achieves a state-of-the-art tradeoff between …
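A minimal sketch of layerwise magnitude pruning, where each parameter tensor receives its own sparsity level and the smallest-magnitude weights within it are removed. The per-layer allocation below is a placeholder, since choosing it well is exactly what the entry above studies.

```python
import torch
import torch.nn as nn

def prune_with_layerwise_sparsity(model: nn.Module, sparsity_per_layer: dict) -> None:
    """Magnitude-prune each listed parameter tensor to its own target sparsity."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            s = sparsity_per_layer.get(name)
            if s is None or p.dim() < 2:
                continue
            k = int(s * p.numel())
            if k == 0:
                continue
            thresh = torch.kthvalue(p.abs().flatten(), k).values
            p.mul_((p.abs() > thresh).float())

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
# Placeholder allocation: prune the wide first layer harder than the output layer.
prune_with_layerwise_sparsity(model, {"0.weight": 0.9, "2.weight": 0.5})
```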