A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

Efficient deep learning: A survey on making deep learning models smaller, faster, and better

G Menghani - ACM Computing Surveys, 2023 - dl.acm.org
Deep learning has revolutionized the fields of computer vision, natural language
understanding, speech recognition, information retrieval, and more. However, with the …

SparseGPT: Massive language models can be accurately pruned in one-shot

E Frantar, D Alistarh - International Conference on Machine …, 2023 - proceedings.mlr.press
We show for the first time that large-scale generative pretrained transformer (GPT) family
models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal …
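A minimal sketch of the one-shot, post-training setting this paper targets, using plain layer-wise magnitude pruning as a baseline; SparseGPT's actual solver uses approximate second-order reconstruction rather than raw magnitudes, and all names below are illustrative.

import torch

def magnitude_prune_(weight: torch.Tensor, sparsity: float = 0.5) -> None:
    """Zero out the `sparsity` fraction of smallest-magnitude weights in place."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return
    threshold = weight.abs().flatten().kthvalue(k).values
    weight.mul_((weight.abs() > threshold).to(weight.dtype))

layer = torch.nn.Linear(4096, 4096)  # stand-in for one transformer weight matrix
with torch.no_grad():
    magnitude_prune_(layer.weight, sparsity=0.5)
print(f"sparsity: {(layer.weight == 0).float().mean():.2%}")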

A simple and effective pruning approach for large language models

M Sun, Z Liu, A Bair, JZ Kolter - arXiv preprint arXiv:2306.11695, 2023 - arxiv.org
As their size increases, Large Language Models (LLMs) are natural candidates for network
pruning methods: approaches that drop a subset of network weights while striving to …
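The method proposed here (Wanda) scores each weight by the product of its magnitude and the norm of the corresponding input activation, then prunes per output row. A minimal sketch, with the calibration statistics and names assumed for illustration:

import torch

def wanda_mask(weight: torch.Tensor, act_norm: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Keep, per output row, the weights with the largest |W_ij| * ||X_j|| score.

    weight:   (out_features, in_features) layer weight
    act_norm: (in_features,) L2 norm of each input feature over a calibration set
    """
    score = weight.abs() * act_norm.unsqueeze(0)   # elementwise importance
    k = int(weight.shape[1] * sparsity)            # weights to drop per row
    drop = torch.topk(score, k, dim=1, largest=False).indices
    mask = torch.ones_like(weight, dtype=torch.bool)
    mask.scatter_(1, drop, False)                  # False = pruned
    return mask

W = torch.randn(8, 16)
x_norm = torch.rand(16)                            # assumed calibration statistics
W = W * wanda_mask(W, x_norm)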

Ego-Exo4D: Understanding skilled human activity from first- and third-person perspectives

K Grauman, A Westbury, L Torresani… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present Ego-Exo4D, a diverse, large-scale, multimodal, multiview video dataset
and benchmark challenge. Ego-Exo4D centers around simultaneously captured egocentric …

Optimal brain compression: A framework for accurate post-training quantization and pruning

E Frantar, D Alistarh - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We consider the problem of model compression for deep neural networks (DNNs) in the
challenging one-shot/post-training setting, in which we are given an accurate trained model …
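The post-training setting is usually cast layer by layer: given calibration inputs X, choose compressed weights Ŵ minimizing ||WX − ŴX||²_F. The sketch below shows only this objective, not the paper's exact greedy solver; shapes and the crude pruning stand-in are assumptions.

import torch

def layer_reconstruction_error(W: torch.Tensor, W_hat: torch.Tensor, X: torch.Tensor) -> torch.Tensor:
    """Frobenius error between original and compressed layer outputs.

    W, W_hat: (out_features, in_features)
    X:        (in_features, n_calibration_samples)
    """
    return torch.linalg.norm(W @ X - W_hat @ X) ** 2

W = torch.randn(64, 128)
X = torch.randn(128, 256)
W_hat = W * (W.abs() > W.abs().median())   # crude 50% magnitude pruning stand-in
print(layer_reconstruction_error(W, W_hat, X))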

Language models are Super Mario: Absorbing abilities from homologous models as a free lunch

L Yu, B Yu, H Yu, F Huang, Y Li - Forty-first International Conference …, 2024 - openreview.net
In this paper, we unveil that Language Models (LMs) can acquire new capabilities by
assimilating parameters from homologous models without retraining or GPUs. We first …
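The paper's core operation, DARE, sparsifies the "delta" parameters (fine-tuned minus base) before merging: randomly drop a fraction p of the entries and rescale the survivors by 1/(1−p) so the expected delta is preserved. A minimal single-tensor sketch; the drop rate and tensors are illustrative.

import torch

def dare(delta: torch.Tensor, p: float = 0.9) -> torch.Tensor:
    """Drop And REscale: zero each delta entry with probability p, scale the rest."""
    keep = torch.rand_like(delta) > p
    return delta * keep / (1.0 - p)

base = torch.randn(4, 4)
finetuned = base + 0.01 * torch.randn(4, 4)
merged = base + dare(finetuned - base, p=0.9)  # absorb abilities without retraining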

NTIRE 2023 challenge on efficient super-resolution: Methods and results

Y Li, Y Zhang, R Timofte, L Van Gool… - Proceedings of the …, 2023 - openaccess.thecvf.com
This paper reviews the NTIRE 2023 challenge on efficient single-image super-resolution
with a focus on the proposed solutions and results. The aim of this challenge is to devise a …

M6-Rec: Generative pretrained language models are open-ended recommender systems

Z Cui, J Ma, C Zhou, J Zhou, H Yang - arXiv preprint arXiv:2205.08084, 2022 - arxiv.org
Industrial recommender systems have been growing increasingly complex and may
involve diverse domains such as e-commerce products and user-generated …

A survey on efficient convolutional neural networks and hardware acceleration

D Ghimire, D Kil, S Kim - Electronics, 2022 - mdpi.com
Over the past decade, deep-learning-based representations have demonstrated remarkable
performance in academia and industry. The learning capability of convolutional neural …