AdaBits: Neural network quantization with adaptive bit-widths

Q Jin, L Yang, Z Liao - … of the IEEE/CVF Conference on …, 2020 - openaccess.thecvf.com
Deep neural networks with adaptive configurations have gained increasing attention due to
the instant and flexible deployment of these models on platforms with different resource …
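
The idea named in the title is switchable bit-widths: one set of weights can be quantized at several precisions, so a single model serves different resource budgets. Below is a minimal sketch of symmetric uniform quantization at a runtime-selectable bit-width; the per-tensor scaling rule and function name are illustrative assumptions, not the paper's training scheme.

```python
import torch

def quantize_weights(w: torch.Tensor, num_bits: int) -> torch.Tensor:
    """Symmetric uniform quantization of a weight tensor to `num_bits`.

    Illustrative only: the point is that the same tensor can be mapped
    onto coarser or finer grids depending on the deployment budget.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 7 for 4-bit signed
    scale = w.abs().max() / qmax            # per-tensor scale (assumption)
    w_int = torch.clamp(torch.round(w / scale), -qmax, qmax)
    return w_int * scale                    # dequantized view of the weights

w = torch.randn(64, 32)
for bits in (8, 6, 4, 2):                   # one model, several deployable precisions
    err = (w - quantize_weights(w, bits)).pow(2).mean()
    print(f"{bits}-bit reconstruction MSE: {err:.6f}")
```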

Improving neural network efficiency via post-training quantization with adaptive floating-point

F Liu, W Zhao, Z He, Y Wang, Z Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Model quantization has emerged as a mandatory technique for efficient inference
with advanced Deep Neural Networks (DNN). It converts the model parameters in full …
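
The title points at post-training quantization into a low-bit floating-point format whose layout adapts to the data. A rough sketch of rounding a tensor onto a simplified float grid with a configurable exponent/mantissa split follows; it handles normal numbers only and does not reproduce the paper's adaptive format-selection procedure.

```python
import torch

def quantize_to_low_bit_float(x: torch.Tensor, exp_bits: int = 4, man_bits: int = 3) -> torch.Tensor:
    """Round x onto a simplified low-bit floating-point grid.

    Sketch only: normal numbers, crude clamping at the exponent limits,
    no subnormal handling.
    """
    bias = 2 ** (exp_bits - 1) - 1
    e_min, e_max = 1 - bias, bias
    sign = torch.sign(x)
    mag = x.abs().clamp_min(1e-30)                         # avoid log2(0)
    e = torch.floor(torch.log2(mag)).clamp(e_min, e_max)   # shared exponent per element
    m = mag / torch.pow(2.0, e)                            # mantissa, in [1, 2) when in range
    m_q = torch.round(m * 2 ** man_bits) / 2 ** man_bits   # round mantissa to man_bits fraction bits
    out = sign * m_q * torch.pow(2.0, e)
    return torch.where(x == 0, torch.zeros_like(x), out)

x = torch.randn(1000)
for eb, mb in [(5, 2), (4, 3), (3, 4)]:                    # different exponent/mantissa splits
    err = (x - quantize_to_low_bit_float(x, eb, mb)).pow(2).mean()
    print(f"E{eb}M{mb} MSE: {err:.2e}")
```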

Improving low-precision network quantization via bin regularization

T Han, D Li, J Liu, L Tian… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Model quantization is an important mechanism for energy-efficient deployment of
deep neural networks on resource-constrained devices by reducing the bit precision of …
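
The title suggests an auxiliary loss that pushes full-precision weights toward the centers of their quantization bins during training. A hedged sketch of such a regularizer is shown below; the regularizer actually used in the paper may take a different form.

```python
import torch

def fake_quantize(w: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Uniform symmetric fake-quantization; returns each weight's bin center."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

def bin_regularizer(w: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Auxiliary loss pulling weights toward their quantization bin centers.

    Illustration only: here we simply penalize the squared gap between a
    weight and its bin center.
    """
    targets = fake_quantize(w, num_bits).detach()   # bin centers carry no gradient
    return (w - targets).pow(2).mean()

w = torch.randn(256, 128, requires_grad=True)
reg = bin_regularizer(w)
reg.backward()                                      # in practice, added to the task loss
print("bin regularization loss:", reg.item())
```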

HAWQ: Hessian aware quantization of neural networks with mixed-precision

Z Dong, Z Yao, A Gholami… - Proceedings of the …, 2019 - openaccess.thecvf.com
Model size and inference speed/power have become a major challenge in the
deployment of neural networks for many applications. A promising approach to address …
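
Hessian-aware mixed precision ranks layers by second-order sensitivity and gives more bits where the loss curvature is high. The sketch below shows one standard way of obtaining such a signal, a Hutchinson estimate of the Hessian trace for a layer's weights; HAWQ's own criterion differs (e.g. top Hessian eigenvalues) and its bit-allocation step is not reproduced here.

```python
import torch

def hessian_trace_estimate(loss: torch.Tensor, param: torch.Tensor, n_samples: int = 8) -> float:
    """Hutchinson estimator of tr(H) for the loss Hessian w.r.t. `param`.

    Uses Hessian-vector products with Rademacher probe vectors; a crude
    per-layer sensitivity signal for mixed-precision bit allocation.
    """
    grad = torch.autograd.grad(loss, param, create_graph=True)[0]
    trace = 0.0
    for _ in range(n_samples):
        v = torch.randint_like(param, 2) * 2 - 1                  # Rademacher probe (+1 / -1)
        hv = torch.autograd.grad(grad, param, grad_outputs=v, retain_graph=True)[0]
        trace += (v * hv).sum().item()
    return trace / n_samples

# toy example: a tiny regression layer
torch.manual_seed(0)
w = torch.randn(10, 5, requires_grad=True)
x, y = torch.randn(32, 5), torch.randn(32, 10)
loss = ((x @ w.t() - y) ** 2).mean()
print("estimated Hessian trace:", hessian_trace_estimate(loss, w))
```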

Low-bit quantization of neural networks for efficient inference

Y Choukroun, E Kravchik, F Yang… - 2019 IEEE/CVF …, 2019 - ieeexplore.ieee.org
Recent machine learning methods use increasingly large deep neural networks to achieve
state-of-the-art results in various tasks. The gains in performance come at the cost of a …
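
One common post-training recipe in this setting is to pick each tensor's quantization scale by minimizing the mean-squared quantization error on calibration data. The sketch below does this with a simple grid search over the clipping range; it illustrates the general idea only, not the paper's exact optimization.

```python
import torch

def mse_optimal_scale(x: torch.Tensor, num_bits: int = 4, n_candidates: int = 100) -> float:
    """Choose a per-tensor scale minimizing ||x - Q(x)||^2 by grid search.

    Illustrative calibration step: try a range of clipping thresholds and
    keep the one with the lowest quantization error.
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = x.abs().max().item()
    best_scale, best_err = max_abs / qmax, float("inf")
    for i in range(1, n_candidates + 1):
        clip = max_abs * i / n_candidates              # candidate clipping range
        scale = clip / qmax
        xq = torch.clamp(torch.round(x / scale), -qmax, qmax) * scale
        err = (x - xq).pow(2).mean().item()
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale

x = torch.randn(10000)
print("MSE-optimal 4-bit scale:", mse_optimal_scale(x))
```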

LSQ+: Improving low-bit quantization through learnable offsets and better initialization

Y Bhalgat, J Lee, M Nagel… - Proceedings of the …, 2020 - openaccess.thecvf.com
Unlike ReLU, newer activation functions (like Swish, H-swish, Mish) that are frequently
employed in popular efficient architectures can also result in negative activation values, with …
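
The learnable offset in the title is what lets a low-bit uniform grid cover the negative values produced by Swish-like activations. Below is a minimal fake-quantizer with a learnable scale and offset trained through the straight-through estimator; the initialization and gradient scaling here are simplifications, not the paper's recipe.

```python
import torch
import torch.nn as nn

class LearnableFakeQuant(nn.Module):
    """Asymmetric fake-quantizer with a learnable scale and offset.

    The offset shifts the quantization grid so negative activations are
    representable; gradients reach scale and offset through the
    straight-through estimator on the rounding step.
    """
    def __init__(self, num_bits: int = 4):
        super().__init__()
        self.qmin, self.qmax = 0, 2 ** num_bits - 1
        self.scale = nn.Parameter(torch.tensor(0.1))
        self.offset = nn.Parameter(torch.tensor(0.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = torch.clamp((x - self.offset) / self.scale, self.qmin, self.qmax)
        q = q + (torch.round(q) - q).detach()            # straight-through rounding
        return q * self.scale + self.offset

quant = LearnableFakeQuant(num_bits=4)
x = torch.randn(8, 16) - 0.3                             # activations with negative values
quant(x).pow(2).mean().backward()                        # both parameters receive gradients
print(quant.scale.grad, quant.offset.grad)
```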

Overcoming oscillations in quantization-aware training

M Nagel, M Fournarakis… - International …, 2022 - proceedings.mlr.press
When training neural networks with simulated quantization, we observe that quantized
weights can, rather unexpectedly, oscillate between two grid-points. The importance of this …
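
The observation is that, under simulated quantization, a latent weight sitting near a bin boundary can flip between two adjacent grid points from step to step. The sketch below shows how such oscillations can be detected by tracking each weight's integer bin over training iterations; the paper's remedies (dampening and freezing) are not shown.

```python
import torch

def quantize_int(w: torch.Tensor, scale: float, qmax: int = 7) -> torch.Tensor:
    """Integer grid assignment used to track which bin each weight occupies."""
    return torch.clamp(torch.round(w / scale), -qmax, qmax)

def oscillation_frequency(weight_history, scale: float) -> torch.Tensor:
    """Fraction of steps on which each weight's integer bin flips."""
    bins = torch.stack([quantize_int(w, scale) for w in weight_history])
    flips = (bins[1:] != bins[:-1]).float()
    return flips.mean(dim=0)

# toy history: the first weight sits near a bin boundary and keeps flipping
torch.manual_seed(0)
scale = 0.1
w = torch.tensor([0.051, 0.30])                           # 0.051 is close to the 0/1 bin edge at 0.05
history = [w + 0.004 * torch.randn(2) for _ in range(200)]
print("per-weight oscillation frequency:", oscillation_frequency(history, scale))
```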

Data-free quantization through weight equalization and bias correction

M Nagel, M van Baalen, T Blankevoort… - Proceedings of the …, 2019 - openaccess.thecvf.com
We introduce a data-free quantization method for deep neural networks that does not
require fine-tuning or hyperparameter selection. It achieves near-original model …
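
The weight-equalization part of the title rescales channels between consecutive layers so that per-tensor quantization does not have to cover wildly different per-channel ranges. A sketch of that equalization step for two fully connected layers separated by a ReLU is given below; the bias-correction step is omitted.

```python
import torch

def cross_layer_equalize(w1: torch.Tensor, b1: torch.Tensor, w2: torch.Tensor):
    """Balance per-channel weight ranges between two consecutive layers.

    For a positive-homogeneous activation such as ReLU, dividing output
    channel i of layer 1 by s_i and multiplying input channel i of layer 2
    by s_i leaves the composed function unchanged while equalizing the
    ranges that per-tensor quantization must cover.
    """
    r1 = w1.abs().amax(dim=1)                     # range of each output channel of layer 1
    r2 = w2.abs().amax(dim=0)                     # range of each input channel of layer 2
    s = torch.sqrt(r1 * r2) / r2                  # equalizing scale per channel
    return w1 / s[:, None], b1 / s, w2 * s[None, :]

w1 = torch.randn(16, 8) * torch.logspace(-2, 1, 16)[:, None]   # channels with very different ranges
b1, w2 = torch.randn(16), torch.randn(4, 16)
w1_eq, b1_eq, w2_eq = cross_layer_equalize(w1, b1, w2)
ranges_before = w1.abs().amax(dim=1)
ranges_after = w1_eq.abs().amax(dim=1)
print("range spread before:", (ranges_before.max() / ranges_before.min()).item())
print("range spread after: ", (ranges_after.max() / ranges_after.min()).item())
```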

DSConv: Efficient convolution operator

MG Nascimento, R Fawcett… - Proceedings of the …, 2019 - openaccess.thecvf.com
Quantization is a popular way of increasing the speed and lowering the memory usage of
Convolutional Neural Networks (CNNs). When labelled training data is available, network …
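
The operator in the title factors convolution weights into a low-bit integer component plus floating-point scalars that restore the original distribution. A loose sketch of that kind of decomposition, as block-wise quantization with one scale per block, is shown below; the block layout and bit-width are assumptions, not the paper's exact design.

```python
import torch

def blockwise_quantize(w: torch.Tensor, num_bits: int = 3, block_size: int = 32):
    """Split weights into a low-bit integer tensor and per-block FP scales."""
    qmax = 2 ** (num_bits - 1) - 1
    flat = w.reshape(-1, block_size)                        # assumes numel divides block_size
    scales = flat.abs().amax(dim=1, keepdim=True) / qmax    # one floating-point scalar per block
    ints = torch.clamp(torch.round(flat / scales), -qmax, qmax)
    return ints.reshape(w.shape), scales

w = torch.randn(64, 32, 3, 3)
ints, scales = blockwise_quantize(w)
w_hat = (ints.reshape(-1, 32) * scales).reshape(w.shape)    # integer kernel times per-block scales
print("reconstruction MSE:", (w - w_hat).pow(2).mean().item())
```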

PACT: Parameterized clipping activation for quantized neural networks

J Choi, Z Wang, S Venkataramani, PIJ Chuang… - arXiv preprint arXiv …, 2018 - arxiv.org
Deep learning algorithms achieve high classification accuracy at the expense of significant
computation cost. To address this cost, a number of quantization schemes have been …
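
PACT makes the activation clipping level a trainable parameter: activations are clipped to [0, alpha] and the clipped range is quantized uniformly, with alpha learned alongside the weights. A minimal sketch follows; the paper additionally regularizes alpha, and optimizer wiring is omitted.

```python
import torch
import torch.nn as nn

class PACTActivation(nn.Module):
    """Activation clipped to [0, alpha] with a learnable alpha, then quantized.

    The clip is written as 0.5 * (|x| - |x - alpha| + alpha), which equals
    clamp(x, 0, alpha) and gives alpha a well-defined gradient; the clipped
    range is mapped onto a uniform k-bit grid via the straight-through
    estimator.
    """
    def __init__(self, num_bits: int = 4, alpha_init: float = 6.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha_init))
        self.levels = 2 ** num_bits - 1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = 0.5 * (x.abs() - (x - self.alpha).abs() + self.alpha)  # clip to [0, alpha]
        scale = self.alpha / self.levels
        q = y / scale
        q = q + (torch.round(q) - q).detach()                      # straight-through rounding
        return q * scale

act = PACTActivation(num_bits=4)
x = torch.randn(8, 16) * 3
act(x).sum().backward()
print("gradient w.r.t. alpha:", act.alpha.grad.item())
```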