Towards efficient post-training quantization of pre-trained language models

H Bai, L Hou, L Shang, X Jiang… - Advances in neural …, 2022 - proceedings.neurips.cc
Network quantization has gained increasing attention with the rapid growth of large pre-
trained language models (PLMs). However, most existing quantization methods for PLMs …
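
For context, a minimal post-training quantization step can be sketched as below: symmetric per-channel int8 rounding of a pre-trained weight matrix. This is a generic baseline, not the specific method proposed in this paper; all names and shapes are illustrative.

```python
import numpy as np

def quantize_per_channel(w, n_bits=8):
    """Symmetric per-channel quantization of each output row of w."""
    qmax = 2 ** (n_bits - 1) - 1                  # 127 for int8
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)      # guard all-zero rows
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 16).astype(np.float32)     # stand-in for a PLM layer
q, s = quantize_per_channel(w)
print("max abs error:", np.abs(w - dequantize(q, s)).max())
```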

Extreme compression of large language models via additive quantization

V Egiazarian, A Panferov, D Kuznedelev… - arXiv preprint arXiv …, 2024 - arxiv.org
The emergence of accurate open large language models (LLMs) has led to a race towards
quantization techniques that enable execution of such models on end-user devices. In this …
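
Additive quantization represents each weight group as a sum of codewords, one per codebook. The sketch below uses a greedy residual variant with a crude k-means fit; the paper's method jointly optimizes the codes (e.g. via beam search), so this is only an approximation of the idea, with all sizes chosen for illustration.

```python
import numpy as np

def fit_codebooks(vectors, n_books=2, n_codes=16, n_iter=10, seed=0):
    """Greedy residual fit: each codebook is k-means on the current residual."""
    rng = np.random.default_rng(seed)
    residual = vectors.copy()
    books = []
    for _ in range(n_books):
        centers = residual[rng.choice(len(residual), n_codes, replace=False)]
        for _ in range(n_iter):
            ids = np.argmin(((residual[:, None] - centers[None]) ** 2).sum(-1), axis=1)
            for k in range(n_codes):
                if (ids == k).any():
                    centers[k] = residual[ids == k].mean(axis=0)
        books.append(centers)
        residual = residual - centers[ids]
    return books

def encode(vectors, books):
    """Assign one code per codebook; the decoded vector is the codeword sum."""
    codes, residual = [], vectors.copy()
    for centers in books:
        ids = np.argmin(((residual[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        codes.append(ids)
        residual = residual - centers[ids]
    return np.stack(codes, axis=1)

vecs = np.random.randn(1024, 8).astype(np.float32)
books = fit_codebooks(vecs)
print(encode(vecs, books).shape)    # (1024, 2): two 4-bit codes per vector
```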

Transform quantization for CNN compression

SI Young, W Zhe, D Taubman… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
In this paper, we compress convolutional neural network (CNN) weights post-training via
transform quantization. Previous CNN quantization techniques tend to ignore the joint …
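
Transform quantization can be illustrated roughly as: decorrelate flattened filters with a KLT/PCA basis, quantize the transform coefficients, then invert the transform. The sketch below uses a uniform per-component step size rather than the paper's rate-optimized bit allocation.

```python
import numpy as np

def transform_quantize(w, n_bits=4):
    """w: (n_filters, d) flattened filters; returns the reconstruction."""
    mean = w.mean(axis=0)
    _, _, vt = np.linalg.svd(w - mean, full_matrices=False)
    coeff = (w - mean) @ vt.T                     # KLT/PCA coefficients
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(coeff).max(axis=0) / qmax      # uniform per-component step
    scale = np.where(scale == 0, 1.0, scale)
    q = np.clip(np.round(coeff / scale), -qmax - 1, qmax)
    return (q * scale) @ vt + mean                # inverse transform

w = np.random.randn(64, 9).astype(np.float32)     # e.g. 64 3x3 filters
print("recon error:", np.abs(w - transform_quantize(w)).mean())
```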

Few shot network compression via cross distillation

H Bai, J Wu, I King, M Lyu - Proceedings of the AAAI Conference on …, 2020 - aaai.org
Model compression has been widely adopted to obtain lightweight deep neural
networks. Most prevalent methods, however, require fine-tuning with sufficient training data …
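
A few-shot layer-wise fitting step in this spirit can be sketched as below: a pruned student layer is fit, in closed form, to reproduce the teacher layer's outputs on a handful of calibration samples. This is a generic stand-in, not the paper's exact cross-distillation scheme; the helper names and shapes are invented for illustration.

```python
import numpy as np

def distill_layer(x, w_teacher, keep):
    """Fit a pruned student layer to the teacher's outputs in closed form.
    x: (n_samples, d_in) few-shot inputs; keep: retained input channels."""
    target = x @ w_teacher.T                      # teacher layer outputs
    w_student, *_ = np.linalg.lstsq(x[:, keep], target, rcond=None)
    return w_student.T                            # (d_out, len(keep))

x = np.random.randn(32, 64)                       # only 32 calibration samples
w_t = np.random.randn(16, 64)
keep = np.arange(0, 64, 2)                        # keep half the input channels
w_s = distill_layer(x, w_t, keep)
print("fit error:", np.abs(x[:, keep] @ w_s.T - x @ w_t.T).mean())
```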

FPFS: Filter-level pruning via distance weight measuring filter similarity

W Zhang, Z Wang - Neurocomputing, 2022 - Elsevier
Deep Neural Networks (DNNs) benefit greatly from convolution, but also bear a heavy
computational burden. Therefore, model compression techniques are used to …
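
The distance-based similarity idea can be illustrated as: measure each filter's distance to its nearest neighbor and prune the filters that have near-duplicates. The sketch below is a simplified reading of that idea, not the paper's exact FPFS measure.

```python
import numpy as np

def prune_similar_filters(filters, prune_ratio=0.25):
    """filters: (n, d) flattened conv filters; returns indices to keep."""
    f = filters / (np.linalg.norm(filters, axis=1, keepdims=True) + 1e-12)
    dist = np.linalg.norm(f[:, None] - f[None], axis=-1)   # pairwise L2
    np.fill_diagonal(dist, np.inf)
    redundancy = dist.min(axis=1)            # small = has a near-duplicate
    n_prune = int(len(filters) * prune_ratio)
    keep = np.argsort(redundancy)[n_prune:]  # drop the most redundant filters
    return np.sort(keep)

filters = np.random.randn(32, 27)            # 32 filters of shape 3x3x3
print("kept:", prune_similar_filters(filters))
```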

MedQ: Lossless ultra-low-bit neural network quantization for medical image segmentation

R Zhang, ACS Chung - Medical Image Analysis, 2021 - Elsevier
Implementing deep convolutional neural networks (CNNs) with Boolean arithmetic is ideal
for eliminating the notoriously high computational expense of deep learning models …
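
An ultra-low-bit baseline in this vein is XNOR-style weight binarization: each filter becomes its sign pattern times a per-filter scale. The sketch below shows only this standard building block, not MedQ's lossless scheme.

```python
import numpy as np

def binarize(w):
    """w: (n_filters, d). Return sign pattern in {-1, +1} and scales alpha."""
    alpha = np.abs(w).mean(axis=1, keepdims=True)   # XNOR-Net closed-form scale
    return np.sign(np.where(w == 0, 1.0, w)), alpha

w = np.random.randn(8, 9)
b, alpha = binarize(w)
print("binary recon error:", np.abs(w - alpha * b).mean())
```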

Heterogeneous model fusion federated learning mechanism based on model mapping

X Lu, Y Liao, C Liu, P Lio, P Hui - IEEE Internet of Things …, 2021 - ieeexplore.ieee.org
The computing power of Internet of Things (IoT) devices varies widely. To enable
IoT devices with lower computing power to perform machine learning, all nodes can only …
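
One naive way to fuse models of different sizes, sketched below, is to embed each client's smaller weight matrix in a corner of a global matrix and average overlapping entries. This is a generic illustration invented here; the paper's model-mapping mechanism is not specified in the snippet.

```python
import numpy as np

def aggregate(client_weights, global_shape):
    """Average each entry over the clients whose sub-matrix covers it."""
    acc = np.zeros(global_shape)
    cnt = np.zeros(global_shape)
    for w in client_weights:               # assume each client model occupies
        r, c = w.shape                     # the top-left corner of the global one
        acc[:r, :c] += w
        cnt[:r, :c] += 1
    return acc / np.maximum(cnt, 1)

clients = [np.random.randn(4, 4), np.random.randn(8, 8), np.random.randn(8, 8)]
print(aggregate(clients, (8, 8)).shape)    # fused global layer (8, 8)
```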

Fixed-point back-propagation training

X Zhang, S Liu, R Zhang, C Liu… - Proceedings of the …, 2020 - openaccess.thecvf.com
The recently emerged quantization technique (i.e., using low bit-width fixed-point data instead of
high bit-width floating-point data) has been applied to the inference of deep neural networks for …
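
Fixed-point training can be sketched as rounding tensors to a Qm.n format in both the forward pass and the gradient step. The toy example below assumes a linear layer under a squared loss; real schemes (including, presumably, this paper's) add dynamic scaling and more careful rounding.

```python
import numpy as np

def to_fixed_point(x, frac_bits=8, word_bits=16):
    """Round to signed fixed point with the given word and fraction widths."""
    step = 2.0 ** -frac_bits
    limit = 2.0 ** (word_bits - frac_bits - 1) - step
    return np.clip(np.round(x / step) * step, -limit, limit)

rng = np.random.default_rng(0)
w, x, y = rng.normal(size=(4, 4)), rng.normal(size=4), rng.normal(size=4)
err = to_fixed_point(w @ x) - y            # quantized forward pass
grad = 2 * np.outer(err, x)                # dL/dW for L = ||Wx - y||^2
w = w - 0.01 * to_fixed_point(grad)        # fixed-point gradient step
```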

A Simple Low-bit Quantization Framework for Video Snapshot Compressive Imaging

M Cao, L Wang, H Wang, X Yuan - arXiv preprint arXiv:2407.21517, 2024 - arxiv.org
Video Snapshot Compressive Imaging (SCI) aims to use a low-speed 2D camera to capture a
high-speed scene as snapshot compressed measurements, followed by a reconstruction …
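
The standard SCI forward model behind this setup: T high-speed frames are modulated by binary masks and summed into a single 2D snapshot, y = Σ_t C_t ⊙ x_t. The sketch below simulates that measurement only; it does not touch the paper's quantized reconstruction network.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, W = 8, 32, 32
frames = rng.random((T, H, W))             # high-speed scene frames x_t
masks = rng.integers(0, 2, (T, H, W))      # binary modulation masks C_t
measurement = (masks * frames).sum(axis=0) # snapshot y = sum_t C_t * x_t
print(measurement.shape)                   # a single 2D measurement
```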

Towards efficient network compression via Few-Shot Slimming

J He, Y Ding, M Zhang, D Li - Neural Networks, 2022 - Elsevier
While previous network compression methods achieve great success, most of them rely on
abundant training data, which is, unfortunately, often unavailable in practice due to some …
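
A common slimming criterion that few-shot variants build on is ranking channels by the magnitude of their BatchNorm scales. The sketch below shows that generic criterion only; it is not necessarily the paper's few-shot procedure.

```python
import numpy as np

def slim_channels(bn_gamma, keep_ratio=0.5):
    """Keep the channels whose BatchNorm scale has the largest magnitude."""
    n_keep = max(1, int(len(bn_gamma) * keep_ratio))
    return np.sort(np.argsort(-np.abs(bn_gamma))[:n_keep])

gamma = np.random.randn(16)                # BN scales of one conv layer
print("kept channels:", slim_channels(gamma))
```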