Hessian-aware pruning and optimal neural implant

Y He, L Xiao - IEEE transactions on pattern analysis and …, 2023 - ieeexplore.ieee.org

The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …

Cited by 147 - Related articles - All 7 versions

Applications and techniques for fast machine learning in science

AMC Deiana, N Tran, J Agar, M Blott… - Frontiers in big …, 2022 - frontiersin.org

In this community review report, we discuss applications and techniques for fast machine
learning (ML) in science—the concept of integrating powerful ML methods into the real-time …

Cited by 65 - Related articles - All 27 versions

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-Power Computer …, 2022 - taylorfrancis.com

This chapter provides approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages/disadvantages of current methods …

Cited by 1323 - Related articles - All 4 versions

SqueezeLLM: Dense-and-sparse quantization

S Kim, C Hooper, A Gholami, Z Dong, X Li… - arXiv preprint arXiv …, 2023 - arxiv.org

Generative Large Language Models (LLMs) have demonstrated remarkable results for a
wide range of tasks. However, deploying these models for inference has been a significant …

Cited by 159 - Related articles - All 4 versions

LungNet: A hybrid deep-CNN model for lung cancer diagnosis using CT and wearable sensor-based medical IoT data

N Faruqui, MA Yousuf, M Whaiduzzaman… - Computers in Biology …, 2021 - Elsevier

Lung cancer, also known as pulmonary cancer, is one of the deadliest cancers, but yet
curable if detected at the early stage. At present, the ambiguous features of the lung cancer …

Cited by 153 - Related articles - All 10 versions

The optimal BERT surgeon: Scalable and accurate second-order pruning for large language models

E Kurtic, D Campos, T Nguyen, E Frantar… - arXiv preprint arXiv …, 2022 - arxiv.org

Transformer-based language models have become a key building block for natural
language processing. While these models are extremely accurate, they can be too large and …

Cited by 125 - Related articles - All 3 versions

Full stack optimization of transformer inference: a survey

S Kim, C Hooper, T Wattanawong, M Kang… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent advances in state-of-the-art DNN architecture design have been moving toward
Transformer models. These models achieve superior accuracy across a wide range of …

Cited by 90 - Related articles - All 4 versions

SQuant: On-the-fly data-free quantization via diagonal Hessian approximation

C Guo, Y Qiu, J Leng, X Gao, C Zhang, Y Liu… - arXiv preprint arXiv …, 2022 - arxiv.org

Quantization of deep neural networks (DNN) has been proven effective for compressing and
accelerating DNN models. Data-free quantization (DFQ) is a promising approach without the …

Cited by 70 - Related articles - All 7 versions

A comprehensive survey on model quantization for deep neural networks in image classification

B Rokh, A Azarpeyvand, A Khanteymoori - ACM Transactions on …, 2023 - dl.acm.org

Recent advancements in machine learning achieved by Deep Neural Networks (DNNs)
have been significant. While demonstrating high accuracy, DNNs are associated with a …

Cited by 71 - Related articles

A comprehensive survey on model quantization for deep neural networks

B Rokh, A Azarpeyvand, A Khanteymoori - arXiv preprint arXiv …, 2022 - arxiv.org

Recent advances in machine learning by deep neural networks are significant. But using
these networks has been accompanied by a huge number of parameters for storage and …

Cited by 25 - Related articles - All 2 versions
