Only train once: A one-shot neural network training and pruning framework

T Chen, B Ji, T Ding, B Fang, G Wang… - Advances in …, 2021 - proceedings.neurips.cc
Structured pruning is a commonly used technique in deploying deep neural networks
(DNNs) onto resource-constrained devices. However, the existing pruning methods are …

OTOv2: Automatic, generic, user-friendly

T Chen, L Liang, T Ding, Z Zhu, I Zharkov - arXiv preprint arXiv …, 2023 - arxiv.org
The existing model compression methods via structured pruning typically require
complicated multi-stage procedures. Each individual stage necessitates numerous …

Sparsity in transformers: A systematic literature review

M Farina, U Ahmad, A Taha, H Younes, Y Mesbah… - Neurocomputing, 2024 - Elsevier
Transformers have become the state-of-the-art architectures for various tasks in Natural
Language Processing (NLP) and Computer Vision (CV); however, their space and …

Learning pruning-friendly networks via Frank-Wolfe: One-shot, any-sparsity, and no retraining

M Lu, X Luo, T Chen, W Chen, D Liu… - … Conference on Learning …, 2022 - openreview.net
We present a novel framework to train a large deep neural network (DNN) only once, which can then be pruned to any sparsity ratio to preserve competitive …
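The snippet names Frank-Wolfe as the training method but shows none of it; as a hedged illustration (not the paper's actual algorithm), the sketch below is a generic Frank-Wolfe update over an ℓ1-ball, whose linear minimization oracle activates at most one coordinate per step and therefore keeps iterates sparse. The function name, radius, and step schedule are illustrative assumptions.

```python
import numpy as np

def frank_wolfe_step(w, grad, radius, step_size):
    """One Frank-Wolfe update over the l1-ball {w : ||w||_1 <= radius}.

    The linear minimization oracle for the l1-ball puts all mass on the
    coordinate with the largest-magnitude gradient entry, which is why
    Frank-Wolfe iterates stay sparse (at most one new nonzero per step).
    """
    i = np.argmax(np.abs(grad))           # coordinate of steepest descent
    s = np.zeros_like(w)
    s[i] = -radius * np.sign(grad[i])     # vertex of the l1-ball
    return (1.0 - step_size) * w + step_size * s  # convex combination

# Toy usage: minimize f(w) = 0.5 * ||w - target||^2 over the l1-ball.
target = np.array([3.0, -1.0, 0.5])
w = np.zeros(3)
for t in range(200):
    grad = w - target                     # gradient of the quadratic
    w = frank_wolfe_step(w, grad, radius=2.0, step_size=2.0 / (t + 2))
print(w)  # sparse iterate inside the l1-ball
```

The classic 2/(t+2) step schedule is used here only because it needs no line search; the paper's training procedure may differ.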

The contextual lasso: Sparse linear models via deep neural networks

R Thompson, A Dezfouli… - Advances in Neural …, 2023 - proceedings.neurips.cc
Sparse linear models are one of several core tools for interpretable machine learning, a field
of emerging importance as predictive models permeate decision-making in many domains …
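As background for the sparse-linear-model building block this entry refers to, here is a minimal lasso fit with scikit-learn. The contextual lasso itself makes the sparse coefficients depend on contextual features through a neural network, which this sketch does not attempt; the data and penalty strength are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

# A plain lasso as the sparse-linear-model building block: the l1 penalty
# drives some coefficients exactly to zero, which is the source of the
# interpretability the abstract refers to.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
beta = np.array([2.0, 0, 0, -1.5, 0, 0, 0, 0.5, 0, 0])  # truly sparse signal
y = X @ beta + 0.1 * rng.normal(size=200)

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)  # most entries are exactly zero
```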

Less is More–Towards parsimonious multi-task models using structured sparsity

R Upadhyay, R Phlypo, R Saini… - … on Parsimony and …, 2024 - proceedings.mlr.press
Model sparsification in deep learning promotes simpler, more interpretable models
with fewer parameters. This not only reduces the model's memory footprint and …

Training structured neural networks through manifold identification and variance reduction

ZS Huang, C Lee - arXiv preprint arXiv:2112.02612, 2021 - arxiv.org
This paper proposes an algorithm (RMDA) for training neural networks (NNs) with a
regularization term for promoting desired structures. RMDA does not incur computation …
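RMDA's own update is not reproduced in the snippet; as a hedged sketch of the kind of structure-promoting regularization it mentions, the code below applies the proximal map of a group-lasso penalty, which zeroes whole groups of weights at once. The group layout and penalty weight are illustrative assumptions, not details of RMDA.

```python
import numpy as np

def group_prox(w, groups, lam):
    """Proximal map of the group-lasso penalty lam * sum_g ||w_g||_2.

    Entire groups are zeroed at once, which is what 'structured' sparsity
    means here: whole neurons/channels vanish rather than single weights.
    """
    out = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out[g] = scale * w[g]
    return out

w = np.array([0.1, -0.2, 3.0, 2.5])
groups = [np.array([0, 1]), np.array([2, 3])]
print(group_prox(w, groups, lam=0.5))  # first group collapses to zero
```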

Modeling ideological salience and framing in polarized online groups with graph neural networks and structured sparsity

V Hofmann, X Dong, JB Pierrehumbert… - arXiv preprint arXiv …, 2021 - arxiv.org
The increasing polarization of online political discourse calls for computational tools that
automatically detect and monitor ideological divides in social media. We introduce a …

Proximal methods for nonconvex composite optimization problems

T Lechner - 2022 - opus.bibliothek.uni-wuerzburg.de
Optimization problems with composite functions deal with the minimization of the sum of a
smooth function and a convex nonsmooth function. In this thesis several numerical methods …
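The thesis entry states the composite template precisely: minimize a smooth function f plus a convex nonsmooth function g. A minimal proximal-gradient sketch for that template, assuming g(x) = λ‖x‖₁ so the prox reduces to soft-thresholding (the problem data are synthetic, and this is one standard method, not the thesis's specific algorithms):

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1 (coordinate-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def proximal_gradient(grad_f, prox_g, x0, step, n_iter=500):
    """Minimize f(x) + g(x): gradient step on smooth f, prox step on nonsmooth g."""
    x = x0
    for _ in range(n_iter):
        x = prox_g(x - step * grad_f(x), step)
    return x

# Toy composite problem: f(x) = 0.5 * ||Ax - b||^2 (smooth), g(x) = lam * ||x||_1.
rng = np.random.default_rng(1)
A = rng.normal(size=(50, 20))
b = rng.normal(size=50)
lam = 0.5
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = Lipschitz constant of grad f

x_hat = proximal_gradient(
    grad_f=lambda x: A.T @ (A @ x - b),
    prox_g=lambda x, t: soft_threshold(x, lam * t),
    x0=np.zeros(20),
    step=step,
)
print(np.count_nonzero(x_hat), "nonzero coordinates")
```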

HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning

T Chen, X Qu, D Aponte, C Banbury, J Ko… - arXiv preprint arXiv …, 2024 - arxiv.org
Structured pruning is one of the most popular approaches to effectively compress the heavy
deep neural networks (DNNs) into compact sub-networks while retaining performance. The …