Rethinking the role of scale for in-context learning: An interpretability-based case study at 66 billion scale

H Bansal, K Gopalakrishnan, S Dingliwal… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models have been shown to perform better with an increase in scale on a wide
variety of tasks via the in-context learning paradigm. In this paper, we investigate the …

Submodular minimax optimization: Finding effective sets

LR Mualem, ER Elenberg… - International …, 2024 - proceedings.mlr.press
Despite the rich existing literature about minimax optimization in continuous settings, only
very partial results of this kind have been obtained for combinatorial settings. In this paper …

DASS: differentiable architecture search for sparse neural networks

H Mousavi, M Loni, M Alibeigi… - ACM Transactions on …, 2023 - dl.acm.org
The deployment of Deep Neural Networks (DNNs) on edge devices is hindered by the
substantial gap between performance requirements and available computational power …

Sequential attention for feature selection

T Yasuda, MH Bateni, L Chen, M Fahrbach… - arXiv preprint arXiv …, 2022 - arxiv.org
Feature selection is the problem of selecting a subset of features for a machine learning
model that maximizes model quality subject to a budget constraint. For neural networks …

A survey of lottery ticket hypothesis

B Liu, Z Zhang, P He, Z Wang, Y Xiao, R Ye… - arXiv preprint arXiv …, 2024 - arxiv.org
The Lottery Ticket Hypothesis (LTH) states that a dense neural network model contains a
highly sparse subnetwork (i.e., winning tickets) that can achieve even better performance …

Structured pruning of neural networks for constraints learning

M Cacciola, A Frangioni, A Lodi - arXiv preprint arXiv:2307.07457, 2023 - arxiv.org
In recent years, the integration of Machine Learning (ML) models with Operations Research
(OR) tools has gained popularity across diverse applications, including cancer treatment …

Bicriteria approximation algorithms for the submodular cover problem

W Chen, V Crawford - Advances in Neural Information …, 2024 - proceedings.neurips.cc
In this paper, we consider the optimization problem Submodular Cover (SCP), which is to
find a minimum cardinality subset of a finite universe $U$ such that the value of a …

Talking Models: Distill Pre-trained Knowledge to Downstream Models via Interactive Communication

Z Zhao, Q Liu, H Gui, B An, L Hong, EH Chi - arXiv preprint arXiv …, 2023 - arxiv.org
Many recent breakthroughs in machine learning have been enabled by the pre-trained
foundation models. By scaling up model parameters, training data, and computation …

Neural Network Reduction with Guided Regularizers

AHM Rafid, A Sandu - arXiv preprint arXiv:2305.18448, 2023 - arxiv.org
Regularization techniques such as $\mathcal{L}_1$ and $\mathcal{L}_2$ regularizers
are effective in sparsifying neural networks (NNs). However, to remove a certain neuron or …

Deep neural networks pruning via the structured perspective regularization

M Cacciola, A Frangioni, X Li, A Lodi - SIAM Journal on Mathematics of Data …, 2023 - SIAM
In machine learning, artificial neural networks (ANNs) are a very powerful tool, broadly used
in many applications. Often, the selected (deep) architectures include many layers, and …