Authors
Sri Aurobindo Munagala, Ameya Prabhu, Anoop M Namboodiri
Publication date
2020/1/1
Conference
BMVC
Description
We discuss a formulation for network compression combining two major paradigms: binarization and pruning. Past works on network binarization have demonstrated that networks are robust to the removal of activation/weight magnitude information, and can perform comparably to full-precision networks with signs alone. Pruning focuses on generating efficient and sparse networks. Both compression paradigms aid deployment in portable settings, where storage, compute and power are limited. We argue that these paradigms are complementary, and can be combined to offer high levels of compression and speedup without any significant accuracy loss. Intuitively, weights/activations closer to zero have higher binarization error, making them good candidates for pruning. Our proposed formulation incorporates speedups from binary convolution algorithms through structured pruning, enabling the removal of pruned parts of the network entirely post-training, beating previous works attempting the same by a significant margin. Overall, our method brings up to 89x layer-wise compression over the corresponding full-precision networks, achieving only 0.33% loss on CIFAR-10 with ResNet-18 at a 40% PFR (Prune Factor Ratio for filters), and 0.3% on ImageNet with ResNet-18 at a 19% PFR.
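The core intuition above (low-magnitude filters incur the highest binarization error, so prune them structurally and binarize the rest) can be illustrated with a minimal PyTorch sketch. This is an assumed reconstruction for illustration only, not the authors' released code: the function name `prune_and_binarize`, the L1 filter-ranking criterion, and the mean-absolute-value scaling factor are common choices in the binarization/pruning literature, not details confirmed by the abstract.

```python
import torch

def prune_and_binarize(weight: torch.Tensor, pfr: float = 0.4):
    """Sketch: structured filter pruning followed by sign binarization.

    `weight` is a conv weight of shape (out_channels, in_channels, kH, kW).
    Filters closest to zero (smallest L1 norm, hence highest binarization
    error) are removed entirely; survivors are binarized to sign(W) with a
    per-filter scaling factor, in the style of XNOR-Net-like schemes.
    Hypothetical illustration, not the paper's exact method.
    """
    n_filters = weight.shape[0]
    n_keep = n_filters - int(round(pfr * n_filters))

    # Rank filters by L1 magnitude; low-magnitude filters are pruned.
    l1 = weight.abs().flatten(1).sum(dim=1)
    keep_idx = torch.argsort(l1, descending=True)[:n_keep]
    kept = weight[keep_idx]

    # Binarize kept filters: sign(W) scaled by each filter's mean
    # absolute weight (the standard closed-form binary scaling factor).
    alpha = kept.abs().flatten(1).mean(dim=1).view(-1, 1, 1, 1)
    binary = alpha * torch.sign(kept)
    return binary, keep_idx

# Example: prune 40% of the filters of a ResNet-style 3x3 conv layer.
w = torch.randn(64, 64, 3, 3)
bw, idx = prune_and_binarize(w, pfr=0.4)
print(bw.shape)  # torch.Size([38, 64, 3, 3]): 26 of 64 filters removed
```

Because whole filters are dropped (structured pruning), the pruned channels can be removed from the network post-training, which is what allows the compressed model to realize actual binary-convolution speedups rather than only nominal sparsity.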