A survey on efficient convolutional neural networks and hardware acceleration

D Ghimire, D Kil, S Kim - Electronics, 2022 - mdpi.com
Over the past decade, deep-learning-based representations have demonstrated remarkable
performance in academia and industry. The learning capability of convolutional neural …

Hardware approximate techniques for deep neural network accelerators: A survey

G Armeniakos, G Zervakis, D Soudris… - ACM Computing …, 2022 - dl.acm.org
Deep Neural Networks (DNNs) are very popular because of their high performance in
various cognitive tasks in Machine Learning (ML). Recent advancements in DNNs have …

Optimal brain compression: A framework for accurate post-training quantization and pruning

E Frantar, D Alistarh - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We consider the problem of model compression for deep neural networks (DNNs) in the
challenging one-shot/post-training setting, in which we are given an accurate trained model …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-Power Computer …, 2022 - taylorfrancis.com
This chapter provides approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages/disadvantages of current methods …

A white paper on neural network quantization

M Nagel, M Fournarakis, RA Amjad… - arXiv preprint arXiv …, 2021 - arxiv.org
While neural networks have advanced the frontiers in many applications, they often come at
a high computational cost. Reducing the power and latency of neural network inference is …

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications, exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

Post-training quantization for vision transformer

Z Liu, Y Wang, K Han, W Zhang… - Advances in Neural …, 2021 - proceedings.neurips.cc
Recently, transformers have achieved remarkable performance on a variety of computer vision
applications. Compared with mainstream convolutional neural networks, vision transformers …

Quantizable transformers: Removing outliers by helping attention heads do nothing

Y Bondarenko, M Nagel… - Advances in Neural …, 2023 - proceedings.neurips.cc
Transformer models have been widely adopted across various domains in recent years, and
large language models especially have advanced the field of AI significantly. Due to their …

Post-training quantization on diffusion models

Y Shang, Z Yuan, B Xie, B Wu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Denoising diffusion (score-based) generative models have recently achieved significant
accomplishments in generating realistic and diverse data. These approaches define a …

BRECQ: Pushing the limit of post-training quantization by block reconstruction

Y Li, R Gong, X Tan, Y Yang, P Hu, Q Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org
We study the challenging task of neural network quantization without end-to-end retraining,
called Post-training Quantization (PTQ). PTQ usually requires a small subset of training data …