Transform quantization for CNN compression

SI Young, W Zhe, D Taubman… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
In this paper, we compress convolutional neural network (CNN) weights post-training via
transform quantization. Previous CNN quantization techniques tend to ignore the joint …
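The snippet cuts off before describing the method, but the generic idea behind transform quantization (decorrelate the weights with an orthogonal transform, then uniformly quantize the transform coefficients) can be sketched as follows. The PCA-style basis and the 4-bit setting below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def transform_quantize(W, bits=4):
    """Illustrative post-training transform quantization of a weight matrix:
    1. decorrelate the rows of W with an orthogonal transform (PCA basis here),
    2. uniformly quantize the transform coefficients,
    3. invert the transform to obtain the reconstructed weights."""
    cov = W.T @ W / W.shape[0]
    _, U = np.linalg.eigh(cov)            # columns of U form an orthonormal basis
    coeffs = W @ U                        # transform coefficients

    # Uniform quantization of the coefficients to `bits` bits.
    step = (coeffs.max() - coeffs.min()) / (2 ** bits - 1)
    q = np.round((coeffs - coeffs.min()) / step)
    dequant = q * step + coeffs.min()

    return dequant @ U.T                  # reconstructed weights

# Example: quantize a random "layer" to 4 bits and check the distortion.
W = np.random.randn(256, 64).astype(np.float32)
W_hat = transform_quantize(W, bits=4)
print("reconstruction MSE:", np.mean((W - W_hat) ** 2))
```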

Optimal gradient compression for distributed and federated learning

A Albasyoni, M Safaryan, L Condat… - arXiv preprint arXiv …, 2020 - arxiv.org
Communicating information, like gradient vectors, between computing nodes in distributed
and federated learning is typically an unavoidable burden, resulting in scalability issues …
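The abstract is truncated, but the setting it describes (compressing gradient vectors before they are communicated between nodes) can be illustrated with a standard top-k sparsifier. This is a generic compressor from the distributed-learning literature, not necessarily the scheme whose optimality the paper studies.

```python
import numpy as np

def topk_compress(grad, k):
    """Generic top-k gradient sparsifier: keep the k largest-magnitude
    entries and transmit (indices, values) instead of the dense vector."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]

def topk_decompress(idx, vals, dim):
    """Reconstruct the sparse gradient estimate on the receiving node."""
    out = np.zeros(dim, dtype=vals.dtype)
    out[idx] = vals
    return out

g = np.random.randn(1_000_000).astype(np.float32)
idx, vals = topk_compress(g, k=10_000)           # keep ~1% of the coordinates
g_hat = topk_decompress(idx, vals, g.size)
print("relative error:", np.linalg.norm(g - g_hat) / np.linalg.norm(g))
```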

A gradient flow framework for analyzing network pruning

ES Lubana, RP Dick - arXiv preprint arXiv:2009.11839, 2020 - arxiv.org
Recent network pruning methods focus on pruning models early on in training. To estimate
the impact of removing a parameter, these methods use importance measures that were …
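As a point of reference for the importance measures mentioned here, a common gradient-based saliency is the first-order score |w · ∂L/∂w|. The PyTorch sketch below computes that generic measure; it is only an illustration, not the gradient-flow criterion derived in the paper.

```python
import torch
import torch.nn as nn

def saliency_scores(model, loss_fn, x, y):
    """Illustrative first-order importance measure |w * dL/dw| per parameter,
    one of the gradient-based saliencies the pruning literature builds on."""
    loss = loss_fn(model(x), y)
    loss.backward()
    scores = {}
    for name, p in model.named_parameters():
        if p.grad is not None:
            scores[name] = (p.detach() * p.grad.detach()).abs()
    return scores

# Toy usage on a small MLP with random data.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
x, y = torch.randn(32, 20), torch.randint(0, 10, (32,))
scores = saliency_scores(model, nn.CrossEntropyLoss(), x, y)
print({k: v.shape for k, v in scores.items()})
```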

An information-theoretic justification for model pruning

B Isik, T Weissman, A No - International Conference on …, 2022 - proceedings.mlr.press
We study the neural network (NN) compression problem, viewing the tension between the
compression ratio and NN performance through the lens of rate-distortion theory. We choose …
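For readers unfamiliar with the rate-distortion lens invoked here, the underlying quantity is Shannon's rate-distortion function, which for a source X and distortion measure d is:

```latex
% Minimum rate (bits per symbol) achievable at expected distortion at most D.
R(D) \;=\; \min_{p(\hat{x}\mid x)\,:\;\mathbb{E}[d(X,\hat{X})]\le D} \; I(X;\hat{X})
```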

Finite blocklength lossy source coding for discrete memoryless sources

L Zhou, M Motani - Foundations and Trends® in …, 2023 - nowpublishers.com
Shannon propounded a theoretical framework (collectively called information theory) that
uses mathematical tools to understand, model and analyze modern mobile wireless …
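For context, the finite blocklength refinement studied in this line of work replaces the asymptotic rate-distortion function with a Gaussian approximation. The standard second-order expansion for a discrete memoryless source at blocklength n and excess-distortion probability ε is:

```latex
% V(D) is the rate-dispersion function; Q^{-1} is the inverse of the
% Gaussian complementary CDF.
R(n, D, \varepsilon) \;=\; R(D) + \sqrt{\tfrac{V(D)}{n}}\, Q^{-1}(\varepsilon)
  + O\!\left(\tfrac{\log n}{n}\right)
```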

Fundamental limitation of semantic communications: Neural estimation for rate-distortion

D Li, J Huang, C Huang, X Qin… - Journal of …, 2023 - ieeexplore.ieee.org
This paper studies the fundamental limit of semantic communications over the discrete
memoryless channel. We consider the scenario of sending a semantic source consisting of an …

RDO-Q: Extremely fine-grained channel-wise quantization via rate-distortion optimization

Z Wang, J Lin, X Geng, MMS Aly… - European Conference on …, 2022 - Springer
Allocating different bit widths to different channels and quantizing them independently brings
higher quantization precision and accuracy. Most prior works use an equal bit width to …
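The premise of the snippet (each channel gets its own bit width) is easy to illustrate with a per-channel uniform quantizer. The random bit allocation in the sketch below is a placeholder for the rate-distortion optimization the paper actually performs.

```python
import numpy as np

def quantize_per_channel(W, bit_widths):
    """Quantize each output channel of a conv weight tensor W
    (shape: out_channels x in_channels x kH x kW) with its own bit width.
    The per-channel bit allocation is assumed to be given."""
    W_hat = np.empty_like(W)
    for c, b in enumerate(bit_widths):
        w = W[c]
        lo, hi = w.min(), w.max()
        step = (hi - lo) / (2 ** b - 1) if hi > lo else 1.0
        W_hat[c] = np.round((w - lo) / step) * step + lo
    return W_hat

W = np.random.randn(16, 8, 3, 3).astype(np.float32)
bits = np.random.choice([2, 4, 6, 8], size=16)   # illustrative allocation only
W_hat = quantize_per_channel(W, bits)
print("per-tensor MSE:", np.mean((W - W_hat) ** 2))
```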

Population risk improvement with model compression: An information-theoretic approach

Y Bu, W Gao, S Zou, VV Veeravalli - Entropy, 2021 - mdpi.com
It has been reported in many recent works on deep model compression that the population
risk of a compressed model can be even better than that of the original model. In this paper …

Taxonomy and evaluation of structured compression of convolutional neural networks

A Kuzmin, M Nagel, S Pitre, S Pendyam… - arXiv preprint arXiv …, 2019 - arxiv.org
The success of deep neural networks in many real-world applications is leading to new
challenges in building more efficient architectures. One effective way of making networks …

On distributed quantization for classification

OA Hanna, YH Ezzeldin, T Sadjadpour… - IEEE Journal on …, 2020 - ieeexplore.ieee.org
We consider the problem of distributed feature quantization, where the goal is to enable a
pretrained classifier at a central node to carry out its classification on features that are …
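The truncated abstract sets up distributed feature quantization: each node quantizes its local features before sending them to a central node that runs a pretrained classifier. A minimal sketch with a fixed-range uniform quantizer (the range, bit width, and two-node setup are illustrative assumptions, not the paper's design) might look like this:

```python
import numpy as np

def uniform_quantize(x, bits, lo=-3.0, hi=3.0):
    """Uniformly quantize features to `bits` bits on a fixed range before
    transmission; range and bit width are illustrative choices."""
    levels = 2 ** bits - 1
    q = np.round((np.clip(x, lo, hi) - lo) / (hi - lo) * levels)
    return q.astype(np.uint8), lo, hi, levels

def dequantize(q, lo, hi, levels):
    """Reconstruct the features at the central node."""
    return q.astype(np.float32) / levels * (hi - lo) + lo

# Two "nodes" each quantize their feature block; the central node
# concatenates the reconstructions and feeds them to the pretrained classifier.
feats_a, feats_b = np.random.randn(1, 64), np.random.randn(1, 64)
qa = uniform_quantize(feats_a, bits=4)
qb = uniform_quantize(feats_b, bits=4)
features = np.concatenate([dequantize(*qa), dequantize(*qb)], axis=1)
print(features.shape)  # (1, 128), ready for the central classifier
```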