A comprehensive survey on model quantization for deep neural networks in image classification

B Rokh, A Azarpeyvand, A Khanteymoori - ACM Transactions on …, 2023 - dl.acm.org
Recent advancements in machine learning achieved by Deep Neural Networks (DNNs)
have been significant. While demonstrating high accuracy, DNNs are associated with a …

Quantization and deployment of deep neural networks on microcontrollers

PE Novac, G Boukli Hacene, A Pegatoquet… - Sensors, 2021 - mdpi.com
Embedding Artificial Intelligence onto low-power devices is a challenging task that has been
partly overcome with recent advances in machine learning and hardware design. Presently …

FracBits: Mixed precision quantization via fractional bit-widths

L Yang, Q Jin - Proceedings of the AAAI Conference on Artificial …, 2021 - ojs.aaai.org
Model quantization helps to reduce model size and latency of deep neural
networks. Mixed precision quantization is favorable with customized hardwares supporting …

SDQ: Stochastic differentiable quantization with mixed precision

X Huang, Z Shen, S Li, Z Liu… - International …, 2022 - proceedings.mlr.press
In order to deploy deep models in a computationally efficient manner, model quantization
approaches have been frequently used. In addition, as new hardware that supports various …

Efficient and effective methods for mixed precision neural network quantization for faster, energy-efficient inference

D Bablani, JL Mckinstry, SK Esser… - arXiv preprint arXiv …, 2023 - arxiv.org
For efficient neural network inference, it is desirable to achieve state-of-the-art accuracy with
the simplest networks requiring the least computation, memory, and power. Quantizing …

Layer importance estimation with imprinting for neural network quantization

H Liu, S Elkerdawy, N Ray… - Proceedings of the …, 2021 - openaccess.thecvf.com
Neural network quantization has achieved a high compression rate using a fixed low bit-width
representation of weights and activations while maintaining the accuracy of the high …

A silicon photonic accelerator for convolutional neural networks with heterogeneous quantization

F Sunny, M Nikdast, S Pasricha - … of the Great Lakes Symposium on VLSI …, 2022 - dl.acm.org
Parameter quantization in convolutional neural networks (CNNs) can help generate efficient
models with lower memory footprint and computational complexity. But, homogeneous …

Free bits: Latency optimization of mixed-precision quantized neural networks on the edge

G Rutishauser, F Conti, L Benini - 2023 IEEE 5th International …, 2023 - ieeexplore.ieee.org
Mixed-precision quantization, where a deep neural network's layers are quantized to
different precisions, offers the opportunity to optimize the trade-offs between model size …

A low memory footprint quantized neural network for depth completion of very sparse time-of-flight depth maps

X Jiang, V Cambareri, G Agresti… - Proceedings of the …, 2022 - openaccess.thecvf.com
Sparse active illumination enables precise time-of-flight depth sensing as it maximizes
signal-to-noise ratio for low power budgets. However, depth completion is required to …

Unsupervised ANN-based equalizer and its trainable FPGA implementation

J Ney, V Lauinger, L Schmalen… - 2023 Joint European …, 2023 - ieeexplore.ieee.org
In recent years, communication engineers put strong emphasis on artificial neural network
(ANN)-based algorithms with the aim of increasing the flexibility and autonomy of the system …