Learned step size quantization

SK Esser, JL McKinstry, D Bablani… - arXiv preprint arXiv …, 2019 - arxiv.org
Deep networks run with low precision operations at inference time offer power and space
advantages over high precision alternatives, but need to overcome the challenge of …
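The core idea admits a compact sketch: treat the quantizer step size as a trainable parameter and pass gradients through the rounding with a straight-through estimator. The PyTorch module below is illustrative only; the `LSQQuantizer` name is made up, but the clip-round-rescale structure and the 1/sqrt(N*Qp) gradient scale follow the paper's description.

```python
import torch
import torch.nn as nn

def grad_scale(x, scale):
    # Forward: identity; backward: incoming gradient is multiplied by `scale`.
    return (x - x * scale).detach() + x * scale

def round_ste(x):
    # Round to nearest with a straight-through gradient estimate.
    return (x.round() - x).detach() + x

class LSQQuantizer(nn.Module):
    # Uniform symmetric fake quantizer with a learned step size (sketch).
    def __init__(self, bits=4, num_elements=1000):
        super().__init__()
        self.qn = -(2 ** (bits - 1))       # e.g. -8 at 4 bits
        self.qp = 2 ** (bits - 1) - 1      # e.g. +7 at 4 bits
        self.step = nn.Parameter(torch.tensor(1.0))
        # Step-size gradient scale g = 1/sqrt(N * Qp), as suggested in the paper.
        self.g = 1.0 / (num_elements * self.qp) ** 0.5

    def forward(self, x):
        s = grad_scale(self.step, self.g)
        v = torch.clamp(x / s, self.qn, self.qp)
        return round_ste(v) * s
```

The step size then trains jointly with the weights under ordinary SGD.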

Replacing mobile camera ISP with a single deep learning model

A Ignatov, L Van Gool… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
As mobile photography keeps growing in popularity, considerable effort is now being
invested in building complex hand-crafted camera ISP solutions. In this work, we …

Loss-aware post-training quantization

Y Nahshan, B Chmiel, C Baskin, E Zheltonozhskii… - Machine Learning, 2021 - Springer
Neural network quantization enables the deployment of large models on resource-
constrained devices. Current post-training quantization methods fall short in terms of …
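"Loss-aware" here means choosing quantization parameters against the network loss on calibration data rather than a local per-tensor distance. The helper below is a hypothetical sketch using a coarse grid search over one layer's clipping value; the paper itself optimizes the step sizes far more efficiently than a grid scan, so treat the names and the search strategy as assumptions.

```python
import torch

def quantize_tensor(w, clip, bits=4):
    # Uniform symmetric quantization of `w` with clipping value `clip`.
    qp = 2 ** (bits - 1) - 1
    step = clip / qp
    return torch.clamp((w / step).round(), -qp - 1, qp) * step

@torch.no_grad()
def loss_aware_clip(model, layer, loss_fn, calib_x, calib_y, bits=4):
    # Pick the clipping value for one layer's weights that minimizes the
    # task loss on a calibration batch (coarse grid search; sketch only).
    w = layer.weight.data.clone()
    best_clip, best_loss = None, float("inf")
    for frac in torch.linspace(0.3, 1.0, 15):
        clip = frac * w.abs().max()
        layer.weight.data = quantize_tensor(w, clip, bits)
        loss = loss_fn(model(calib_x), calib_y).item()
        if loss < best_loss:
            best_clip, best_loss = clip, loss
    layer.weight.data = quantize_tensor(w, best_clip, bits)
    return best_clip
```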

Trained quantization thresholds for accurate and efficient fixed-point inference of deep neural networks

S Jain, A Gural, M Wu, C Dick - Proceedings of Machine …, 2020 - proceedings.mlsys.org
We propose a method of training quantization thresholds (TQT) for uniform symmetric
quantizers using standard backpropagation and gradient descent. Contrary to prior work, we …
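The mechanics can be sketched in a few lines: parameterize each threshold by its base-2 logarithm, constrain the resulting scale to a power of two, and let backpropagation adjust it through straight-through estimators. The module below is an illustrative reading of that recipe, not the authors' code.

```python
import torch
import torch.nn as nn

def round_ste(x):
    # Straight-through rounding: round forward, identity gradient.
    return (x.round() - x).detach() + x

class TQTQuantizer(nn.Module):
    # Uniform symmetric quantizer whose clipping threshold is trained
    # through log2(t), keeping it positive and hardware-friendly (sketch).
    def __init__(self, bits=8, init_threshold=1.0):
        super().__init__()
        self.log2_t = nn.Parameter(torch.tensor(float(init_threshold)).log2())
        self.qp = 2 ** (bits - 1) - 1

    def forward(self, x):
        # Power-of-two scale s = 2^ceil(log2 t) / 2^(b-1); ceil uses an STE.
        log2_t = (self.log2_t.ceil() - self.log2_t).detach() + self.log2_t
        s = 2.0 ** log2_t / (self.qp + 1)
        q = torch.clamp(x / s, -self.qp - 1, self.qp)
        return round_ste(q) * s
```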

Mixed-precision neural network quantization via learned layer-wise importance

C Tang, K Ouyang, Z Wang, Y Zhu, W Ji… - … on Computer Vision, 2022 - Springer
The exponentially large discrete search space in mixed-precision quantization (MPQ) makes
it hard to determine the optimal bit-width for each layer. Previous works usually resort to …
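Once a per-layer importance score is available, the exponential search collapses to an ordering problem. A minimal sketch, assuming importance scores and layer sizes are given; this greedy budgeted assignment is a stand-in for the paper's actual allocation procedure:

```python
def assign_bitwidths(importance, sizes, budget_bits, choices=(2, 4, 8)):
    # Greedy mixed-precision assignment from per-layer importance scores.
    #   importance:  higher = more sensitive to quantization
    #   sizes:       parameter count per layer
    #   budget_bits: total bit budget, e.g. 4 * sum(sizes) for a 4-bit average
    n = len(importance)
    bits = [min(choices)] * n  # start every layer at the lowest width
    used = sum(b * s for b, s in zip(bits, sizes))
    # Visit layers from most to least important, upgrading while budget allows.
    for i in sorted(range(n), key=lambda i: -importance[i]):
        for b in sorted(choices):
            if b > bits[i] and used + (b - bits[i]) * sizes[i] <= budget_bits:
                used += (b - bits[i]) * sizes[i]
                bits[i] = b
    return bits
```

For example, `assign_bitwidths([0.9, 0.1, 0.5], [1000, 1000, 1000], budget_bits=4 * 3000)` gives the most sensitive layer the widest format the budget permits.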

Improving neural network efficiency via post-training quantization with adaptive floating-point

F Liu, W Zhao, Z He, Y Wang, Z Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Model quantization has emerged as a mandatory technique for efficient inference
with advanced Deep Neural Networks (DNN). It converts the model parameters in full …
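The idea of an adaptive floating-point format can be illustrated by a quantizer whose exponent and mantissa widths are free parameters. The toy function below rounds a tensor to such a format; the per-tensor adaptation of the format itself, which is the paper's contribution, is not shown.

```python
import torch

def quantize_afp(x, exp_bits=3, man_bits=4):
    # Round a torch.Tensor to a toy low-bit floating-point format with
    # `exp_bits` exponent and `man_bits` mantissa bits (illustrative only).
    sign = torch.sign(x)
    mag = x.abs()
    # Decompose magnitude into mantissa in [1, 2) and an integer exponent.
    exp = torch.floor(torch.log2(torch.where(mag > 0, mag, torch.ones_like(mag))))
    e_max = 2 ** (exp_bits - 1) - 1        # symmetric toy exponent range
    exp = torch.clamp(exp, -e_max, e_max)
    man = torch.where(mag > 0, mag / 2.0 ** exp, torch.zeros_like(mag))
    man = torch.round(man * 2 ** man_bits) / 2 ** man_bits
    return sign * man * 2.0 ** exp
```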

Robust quantization: One model to rule them all

B Chmiel, R Banner, G Shomron… - Advances in neural …, 2020 - proceedings.neurips.cc
Neural network quantization methods often involve simulating the quantization process
during training, making the trained model highly dependent on the target bit-width and …
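The paper's remedy is to regularize the weight distribution itself, via its kurtosis, so that a single trained model degrades gracefully at whatever bit-width is later chosen. A minimal sketch of such a penalty; the target value 1.8 (the kurtosis of a uniform distribution) follows the paper, while the weighting of the term is illustrative:

```python
import torch

def kurtosis_penalty(w, target=1.8):
    # Penalize the kurtosis of a weight tensor toward that of a uniform
    # distribution, flattening heavy tails that are hardest to quantize.
    w = w.flatten()
    z = (w - w.mean()) / (w.std() + 1e-8)
    kurt = (z ** 4).mean()
    return (kurt - target) ** 2
```

Summed over all weight tensors and added to the task loss, the penalty shapes weights that tolerate many bit-widths at once.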

HMQ: Hardware-friendly mixed precision quantization block for CNNs

HV Habi, RH Jennings, A Netzer - … , Glasgow, UK, August 23–28, 2020 …, 2020 - Springer
Recent work in network quantization produced state-of-the-art results using mixed precision
quantization. An imperative requirement for many efficient edge device hardware …
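A hardware-friendly search can be made differentiable by mixing candidate quantizers with Gumbel-softmax weights. The block below sketches that mechanism; the `MPQBlock` name and the fixed threshold are assumptions, and HMQ additionally restricts its thresholds to powers of two.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant(x, bits, threshold):
    # Uniform symmetric fake quantization at the given bit-width.
    qp = 2 ** (bits - 1) - 1
    step = threshold / qp
    return torch.clamp((x / step).round(), -qp - 1, qp) * step

class MPQBlock(nn.Module):
    # Differentiable bit-width selection: candidate quantizers are mixed
    # with Gumbel-softmax weights during the search (sketch).
    def __init__(self, bit_choices=(2, 4, 8), threshold=1.0, tau=1.0):
        super().__init__()
        self.bit_choices = bit_choices
        self.logits = nn.Parameter(torch.zeros(len(bit_choices)))
        self.threshold = threshold
        self.tau = tau

    def forward(self, x):
        probs = F.gumbel_softmax(self.logits, tau=self.tau)
        outs = [fake_quant(x, b, self.threshold) for b in self.bit_choices]
        return sum(p * o for p, o in zip(probs, outs))
```

After the search, the argmax of the logits fixes each tensor's bit-width.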

Improving low-precision network quantization via bin regularization

T Han, D Li, J Liu, L Tian… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Model quantization is an important mechanism for energy-efficient deployment of
deep neural networks on resource-constrained devices by reducing the bit precision of …
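Bin regularization can be pictured as a clustering penalty: assign each weight to its quantization bin, then pull the bin's mean toward the bin center and shrink its spread. The function below is a sketch under that reading, with `step` and `bits` assumed given.

```python
import torch

def bin_regularization(w, step, bits=2):
    # Encourage weights to gather sharply around their quantization bin
    # centers: mean -> center, variance -> zero (illustrative sketch).
    qp = 2 ** (bits - 1) - 1
    centers = torch.arange(-qp - 1, qp + 1, device=w.device, dtype=w.dtype) * step
    idx = torch.clamp((w / step).round(), -qp - 1, qp).long() + qp + 1
    loss = w.new_zeros(())
    for k in range(len(centers)):
        bin_w = w.flatten()[idx.flatten() == k]
        if bin_w.numel() > 0:
            loss = loss + (bin_w.mean() - centers[k]) ** 2
        if bin_w.numel() > 1:
            loss = loss + bin_w.var()
    return loss
```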

Dynamic precision analog computing for neural networks

S Garg, J Lou, A Jain, Z Guo, BJ Shastri… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Analog electronic and optical computing exhibit tremendous advantages over digital
computing for accelerating deep learning when operations are executed at low precision …
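In the analog setting, precision is not a stored bit-width but a signal-to-noise ratio, which is what makes it dynamically tunable. A crude noise model makes the point; this is an illustrative stand-in, not the paper's device model:

```python
import torch

def analog_matmul(x, w, bits=6):
    # Model an analog matrix multiply whose precision is set by noise:
    # Gaussian noise with std chosen so the SNR corresponds to roughly
    # `bits` of effective precision. Raising `bits` per layer mimics
    # dynamically trading power for precision.
    y = x @ w
    noise_std = y.abs().max() / (2 ** bits)
    return y + noise_std * torch.randn_like(y)
```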