Authors
Vitor Finotti, Bruno Albertini
Publication date
2021/10/1
Journal
Computers and Electrical Engineering
Volume
95
Pages
107446
Publisher
Pergamon
Description
Mobile and embedded applications of convolutional neural networks (CNNs) use quantization to reduce model size and increase computational efficiency. However, working with quantized networks often implies using non-standard training and execution methods, as modern frameworks offer limited support to fixed-point operations. We propose a quantization approach simulating the effects of quantization in CNN inference without needing to be re-implemented using fixed-point arithmetic, reducing overhead and complexity in evaluating existing networks’ responses to quantization. The proposed method provides a fast way of performing post-training quantization with different bit widths in activations and weights. Our experimental results on ImageNet CNNs show a model size reduction of more than 50%, while maintaining classification accuracy without a need for retraining. We also measured the relationship …
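The abstract describes simulating quantization effects in floating point rather than re-implementing the network in fixed-point arithmetic. Below is a minimal sketch of that "fake quantization" idea, assuming symmetric uniform per-tensor quantization; the paper's exact scheme may differ, and the names `fake_quantize`, `num_bits` are illustrative, not from the source.

```python
import numpy as np

def fake_quantize(x: np.ndarray, num_bits: int) -> np.ndarray:
    """Quantize-dequantize `x` onto a signed `num_bits` integer grid.

    The output is still floating point, so it can flow through an
    unmodified inference pipeline while exhibiting quantization error.
    """
    qmax = 2 ** (num_bits - 1) - 1        # e.g. 127 for 8 bits
    scale = np.max(np.abs(x)) / qmax      # per-tensor symmetric scale
    if scale == 0.0:
        return x                          # all-zero tensor: nothing to quantize
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                      # back to float ("fake" quantization)

# Example: simulate 8-bit weights and 6-bit activations post-training.
weights = np.random.randn(64, 3, 3, 3).astype(np.float32)
activations = np.random.randn(1, 32, 32, 3).astype(np.float32)
w_q = fake_quantize(weights, num_bits=8)
a_q = fake_quantize(activations, num_bits=6)
print("max weight error:", np.abs(weights - w_q).max())
```

Applying such a function to weights and activations at different bit widths makes it possible to measure a trained model's accuracy under quantization without retraining or a fixed-point reimplementation, which is the evaluation workflow the abstract claims to speed up.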