A survey on deep neural network pruning: taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - arXiv preprint arXiv:2308.06767, 2023 - arxiv.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

Hardware and software optimizations for accelerating deep neural networks: Survey of current trends, challenges, and the road ahead

M Capra, B Bussolino, A Marchisio, G Masera… - IEEE …, 2020 - ieeexplore.ieee.org
Currently, Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning
(DL) is already present in many applications ranging from computer vision for medicine to …

EDEN: Enabling energy-efficient, high-performance deep neural network inference using approximate DRAM

S Koppula, L Orosa, AG Yağlıkçı, R Azizi… - Proceedings of the …, 2019 - dl.acm.org
The effectiveness of deep neural networks (DNN) in vision, speech, and language
processing has prompted a tremendous demand for energy-efficient high-performance DNN …

DSA: More efficient budgeted pruning via differentiable sparsity allocation

X Ning, T Zhao, W Li, P Lei, Y Wang, H Yang - European Conference on …, 2020 - Springer
Budgeted pruning is the problem of pruning under resource constraints. In budgeted
pruning, how to distribute the resources across layers (i.e., sparsity allocation) is the key …
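To make the notion of "sparsity allocation" concrete, here is a minimal sketch of a common magnitude-based baseline: a single global threshold decides how much of the pruning budget each layer absorbs. This is an illustration only, not the DSA method, which learns the allocation differentiably; the function name and toy model are hypothetical.

```python
# Illustrative baseline (not DSA): allocate a global pruning budget across
# layers by ranking all weights under one global magnitude threshold.
import torch
import torch.nn as nn

def allocate_sparsity_by_global_magnitude(model: nn.Module, budget: float):
    """Return {layer_name: sparsity} such that `budget` of all conv/linear
    weights fall below a single global magnitude threshold."""
    scores = {name: m.weight.detach().abs().flatten()
              for name, m in model.named_modules()
              if isinstance(m, (nn.Conv2d, nn.Linear))}
    all_scores = torch.cat(list(scores.values()))
    k = max(1, int(budget * all_scores.numel()))      # weights to prune globally
    threshold = torch.kthvalue(all_scores, k).values  # k-th smallest magnitude
    return {name: (s <= threshold).float().mean().item()
            for name, s in scores.items()}

# Toy model (only its weights are inspected; forward() is never called).
toy = nn.Sequential(nn.Conv2d(3, 16, 3), nn.Conv2d(16, 32, 3), nn.Linear(32, 10))
print(allocate_sparsity_by_global_magnitude(toy, budget=0.5))
```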

Model compression with adversarial robustness: A unified optimization framework

S Gui, H Wang, H Yang, C Yu… - Advances in Neural …, 2019 - proceedings.neurips.cc
Deep model compression has been extensively studied, and state-of-the-art methods can
now achieve high compression ratios with minimal accuracy loss. This paper studies model …

GDP: Stabilized neural network pruning via gates with differentiable polarization

Y Guo, H Yuan, J Tan, Z Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Model compression techniques have recently gained explosive attention for obtaining
efficient AI models for various real-time applications. Channel pruning is one …
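A minimal sketch of the general gate-based channel-pruning setup the title refers to: each output channel is scaled by a learnable gate, and a penalty pushes gates toward 0 or 1 so near-zero channels can later be removed. The class name and the binary-entropy penalty are assumptions for illustration, not GDP's exact polarization function.

```python
# Generic gate-based channel pruning sketch (not GDP's exact formulation).
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.gate_logits = nn.Parameter(torch.zeros(out_ch))   # gates start at 0.5

    def gates(self):
        return torch.sigmoid(self.gate_logits)                 # differentiable gates

    def forward(self, x):
        return self.conv(x) * self.gates().view(1, -1, 1, 1)   # scale each channel

def polarization_penalty(g, eps=1e-6):
    # Binary entropy of the gates: minimized when every gate sits at 0 or 1.
    return -(g * (g + eps).log() + (1 - g) * (1 - g + eps).log()).mean()

layer = GatedConv2d(16, 32, 3)
out = layer(torch.randn(2, 16, 8, 8))
loss = out.pow(2).mean() + 0.1 * polarization_penalty(layer.gates())
loss.backward()
```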

Dual-side sparse tensor core

Y Wang, C Zhang, Z Xie, C Guo, Y Liu… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Leveraging sparsity in deep neural network (DNN) models is promising for accelerating
model inference. Yet existing GPUs can only leverage the sparsity from weights but not …

Accelerating sparse DNN models without hardware-support via tile-wise sparsity

C Guo, BY Hsueh, J Leng, Y Qiu… - … Conference for High …, 2020 - ieeexplore.ieee.org
Network pruning can reduce the high computation cost of deep neural network (DNN)
models. However, to maintain their accuracies, sparse models often carry randomly …
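The snippet contrasts random sparsity with a more regular pattern. Below is a hedged sketch of the tile-wise idea, simplified for illustration rather than reproducing the paper's tiling scheme: magnitude pruning is applied independently inside each tile of a weight matrix, so every tile keeps the same fraction of weights.

```python
# Simplified tile-wise pruning sketch: uniform density per T x T tile instead
# of a random, hard-to-accelerate scatter of zeros.
import torch

def tile_wise_prune(w: torch.Tensor, tile: int = 4, keep: float = 0.5) -> torch.Tensor:
    rows, cols = w.shape
    assert rows % tile == 0 and cols % tile == 0, "pad w to a multiple of the tile size"
    out = w.clone()
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            block = out[r:r + tile, c:c + tile]            # a view into `out`
            k = max(1, int(keep * block.numel()))          # weights kept per tile
            thresh = block.abs().flatten().topk(k).values[-1]
            block[block.abs() < thresh] = 0.0              # prune in place
    return out

w = torch.randn(8, 8)
print((tile_wise_prune(w) == 0.0).float().mean())          # roughly 1 - keep
```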

Fire together wire together: A dynamic pruning approach with self-supervised mask prediction

S Elkerdawy, M Elhoushi, H Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Dynamic model pruning is a recent direction that allows for the inference of a different sub-
network for each input sample during deployment. However, current dynamic methods rely …
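A generic sketch of what "a different sub-network per input" looks like in code, assumed for illustration: a tiny head predicts a per-sample channel mask from the pooled input. The cited work additionally trains its mask predictor with a self-supervised objective, which is not shown here, and training through this hard top-k would need a straight-through or Gumbel-style relaxation.

```python
# Input-dependent (dynamic) channel pruning sketch; class name is illustrative.
import torch
import torch.nn as nn

class DynamicPrunedConv(nn.Module):
    def __init__(self, in_ch, out_ch, keep=0.5):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.mask_head = nn.Linear(in_ch, out_ch)   # per-sample channel scores
        self.k = max(1, int(keep * out_ch))

    def forward(self, x):
        scores = self.mask_head(x.mean(dim=(2, 3)))             # (B, out_ch)
        topk = scores.topk(self.k, dim=1).indices
        mask = torch.zeros_like(scores).scatter_(1, topk, 1.0)  # hard 0/1 mask
        return self.conv(x) * mask.view(x.size(0), -1, 1, 1)

layer = DynamicPrunedConv(16, 32, keep=0.25)
y = layer(torch.randn(4, 16, 8, 8))   # each sample keeps a different 25% of channels
```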

GAN slimming: All-in-one GAN compression by a unified optimization framework

H Wang, S Gui, H Yang, J Liu, Z Wang - European Conference on …, 2020 - Springer
Generative adversarial networks (GANs) have gained increasing popularity in various
computer vision applications, and have recently started to be deployed to resource-constrained …