DSConv: Efficient convolution operator

MG Nascimento, R Fawcett… - Proceedings of the …, 2019 - openaccess.thecvf.com
Quantization is a popular way of increasing the speed and lowering the memory usage of
Convolutional Neural Networks (CNNs). When labelled training data is available, network …
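As background for the quantization entries in this listing, here is a minimal sketch of uniform symmetric int8 quantization in NumPy. It illustrates the generic speed/memory argument only, not the DSConv operator itself; the function name and the int8 choice are illustrative assumptions.

```python
import numpy as np

def uniform_quantize_int8(w):
    # Store a float32 tensor as int8 values plus one scale: roughly 4x less
    # memory, and integer arithmetic at inference time. Illustrative only,
    # not the DSConv operator from the paper above.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

w = np.random.randn(3, 3, 64, 64).astype(np.float32)   # e.g. a conv kernel
q, scale = uniform_quantize_int8(w)
print(w.nbytes, q.nbytes)                               # int8 storage is 4x smaller
w_hat = q.astype(np.float32) * scale                    # dequantize to inspect the error
print(np.abs(w - w_hat).max())
```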

Post training 4-bit quantization of convolutional networks for rapid-deployment

R Banner, Y Nahshan… - Advances in Neural …, 2019 - proceedings.neurips.cc
Convolutional neural networks require significant memory bandwidth and storage for
intermediate computations, apart from substantial computing resources. Neural network …

Explicit loss-error-aware quantization for low-bit deep neural networks

A Zhou, A Yao, K Wang, Y Chen - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
Benefiting from tens of millions of hierarchically stacked learnable parameters, Deep Neural
Networks (DNNs) have demonstrated overwhelming accuracy on a variety of artificial …

Value-aware quantization for training and inference of neural networks

E Park, S Yoo, P Vajda - Proceedings of the European …, 2018 - openaccess.thecvf.com
We propose a novel value-aware quantization scheme that applies aggressively reduced
precision to the majority of data while separately handling a small amount of large data in …
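The snippet describes the core idea: quantize the bulk of the data aggressively while keeping a small fraction of large-magnitude values at high precision. Below is a rough sketch of that split, assuming a symmetric uniform quantizer for the bulk; the function name, 4-bit width, and 1% outlier ratio are illustrative assumptions, not settings from the paper.

```python
import numpy as np

def value_aware_quantize(x, bits=4, outlier_ratio=0.01):
    # Keep the largest-magnitude fraction of values in full precision and
    # quantize the rest with a symmetric uniform quantizer.
    x = np.asarray(x, dtype=np.float32)
    flat = np.abs(x).ravel()
    k = max(1, int(outlier_ratio * flat.size))
    threshold = np.partition(flat, flat.size - k)[flat.size - k]  # k-th largest magnitude
    outliers = np.abs(x) >= threshold

    qmax = 2 ** (bits - 1) - 1
    bulk = np.where(outliers, 0.0, x)                     # exclude outliers from the range
    scale = max(float(np.abs(bulk).max()), 1e-8) / qmax
    q = np.clip(np.round(bulk / scale), -qmax - 1, qmax) * scale

    return np.where(outliers, x, q)                       # re-insert full-precision outliers

w = np.random.randn(1000).astype(np.float32)
w[:5] *= 20.0                                             # a few large values
print(np.abs(w - value_aware_quantize(w)).max())
```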

MXQN: Mixed quantization for reducing bit-width of weights and activations in deep convolutional neural networks

C Huang, P Liu, L Fang - Applied Intelligence, 2021 - Springer
Quantization, which involves bit-width reduction, is considered one of the most effective
approaches to rapidly and energy-efficiently deploy deep convolutional neural networks …

Fixed-point quantization of convolutional neural networks for quantized inference on embedded platforms

R Goyal, J Vanschoren, V Van Acht… - arXiv preprint arXiv …, 2021 - arxiv.org
Convolutional Neural Networks (CNNs) have proven to be a powerful state-of-the-art
method for image classification tasks. One drawback, however, is the high computational …

Bit efficient quantization for deep neural networks

P Nayak, D Zhang, S Chai - 2019 Fifth Workshop on Energy …, 2019 - ieeexplore.ieee.org
Quantization for deep neural networks has afforded models for edge devices that use less
on-board memory and enable efficient low-power inference. In this paper, we present a …

LSQ+: Improving low-bit quantization through learnable offsets and better initialization

Y Bhalgat, J Lee, M Nagel… - Proceedings of the …, 2020 - openaccess.thecvf.com
Unlike ReLU, newer activation functions (like Swish, H-swish, Mish) that are frequently
employed in popular efficient architectures can also result in negative activation values, with …
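The abstract motivates an asymmetric quantizer whose offset is learned, so the integer grid can cover the negative values that Swish-like activations produce. A minimal PyTorch sketch in that spirit is shown below, with a learnable scale and offset trained through a straight-through estimator; the class and parameter names, bit-width, and initial values are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableOffsetFakeQuant(nn.Module):
    # Asymmetric fake-quantization with a learnable scale and offset; the
    # offset lets the integer grid cover negative Swish/H-swish/Mish outputs.
    def __init__(self, bits=4, init_scale=0.1, init_offset=-0.3):
        super().__init__()
        self.qmin, self.qmax = 0, 2 ** bits - 1
        self.scale = nn.Parameter(torch.tensor(init_scale))
        self.offset = nn.Parameter(torch.tensor(init_offset))

    def forward(self, x):
        q = (x - self.offset) / self.scale                # map to the integer grid
        q = torch.clamp(q, self.qmin, self.qmax)
        q = q + (torch.round(q) - q).detach()             # straight-through estimator for round()
        return q * self.scale + self.offset               # map back to real values

x = F.silu(torch.randn(8, 16))                            # Swish activations, partly negative
fq = LearnableOffsetFakeQuant(bits=4)
fq(x).sum().backward()                                     # gradients reach scale and offset
print(fq.scale.grad.item(), fq.offset.grad.item())
```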

AdaBits: Neural network quantization with adaptive bit-widths

Q Jin, L Yang, Z Liao - … of the IEEE/CVF Conference on …, 2020 - openaccess.thecvf.com
Deep neural networks with adaptive configurations have gained increasing attention due to
the instant and flexible deployment of these models on platforms with different resource …

Least squares binary quantization of neural networks

H Pouransari, Z Tu, O Tuzel - Proceedings of the IEEE/CVF …, 2020 - openaccess.thecvf.com
Quantizing weights and activations of deep neural networks results in significant
improvement in inference efficiency at the cost of lower accuracy. A source of the accuracy …
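For context on what "least squares binary quantization" optimizes, the classic single-bit case picks a scale alpha and a sign tensor b in {-1, +1} minimizing ||w - alpha*b||^2, which has the closed-form solution b = sign(w), alpha = mean(|w|). The sketch below shows only this standard single-bit result, not the paper's more general multi-bit schemes.

```python
import numpy as np

def binary_quantize_least_squares(w):
    # Closed-form least-squares fit of a scaled binary tensor to w:
    # b = sign(w), alpha = mean(|w|) minimize ||w - alpha * b||^2.
    b = np.where(w >= 0, 1.0, -1.0)
    alpha = np.abs(w).mean()
    return alpha, b

w = np.random.randn(256).astype(np.float32)
alpha, b = binary_quantize_least_squares(w)
err = np.linalg.norm(w - alpha * b) ** 2
print(f"alpha = {alpha:.4f}, squared error = {err:.2f}")
```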