- Academic Resource Search

Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org

Domain-specific hardware is becoming a promising topic in the backdrop of improvement
slow down for general-purpose processors due to the foreseeable end of Moore's Law …

Cited by 964 · Related articles · All 2 versions

A comprehensive survey on model compression and acceleration

T Choudhary, V Mishra, A Goswami… - Artificial Intelligence …, 2020 - Springer

In recent years, machine learning (ML) and deep learning (DL) have shown remarkable
improvement in computer vision, natural language processing, stock prediction, forecasting …

Cited by 521 · Related articles · All 8 versions

[PDF] arxiv.org

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier

Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

Cited by 825 · Related articles · All 6 versions

[PDF] thecvf.com

Differentiable soft quantization: Bridging full-precision and low-bit neural networks

R Gong, X Liu, S Jiang, T Li, P Hu… - Proceedings of the …, 2019 - openaccess.thecvf.com

Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently
accelerate the inference and meanwhile reduce memory consumption of the deep neural …

Cited by 550 · Related articles · All 12 versions

[PDF] arxiv.org

Pact: Parameterized clipping activation for quantized neural networks

J Choi, Z Wang, S Venkataramani, PIJ Chuang… - arXiv preprint arXiv …, 2018 - arxiv.org

Deep learning algorithms achieve high classification accuracy at the expense of significant
computation cost. To address this cost, a number of quantization schemes have been …

Cited by 1114 · Related articles · All 5 versions

[PDF] thecvf.com

Learning to quantize deep networks by optimizing quantization intervals with task loss

S Jung, C Son, S Lee, J Son, JJ Han… - Proceedings of the …, 2019 - openaccess.thecvf.com

Reducing bit-widths of activations and weights of deep networks makes it efficient to
compute and store them in memory, which is crucial in their deployments to resource-limited …

Cited by 457 · Related articles · All 8 versions

[PDF] mlsys.org

Accurate and efficient 2-bit quantized neural networks

J Choi, S Venkataramani… - Proceedings of …, 2019 - proceedings.mlsys.org

Deep learning algorithms achieve high classification accuracy at the expense of significant
computation cost. In order to reduce this cost, several quantization schemes have gained …

Cited by 210 · Related articles · All 4 versions

[PDF] arxiv.org

Compression of deep learning models for text: A survey

M Gupta, P Agrawal - ACM Transactions on Knowledge Discovery from …, 2022 - dl.acm.org

In recent years, the fields of natural language processing (NLP) and information retrieval (IR)
have made tremendous progress thanks to deep learning models like Recurrent Neural …

Cited by 125 · Related articles · All 5 versions

[PDF] thecvf.com

Adabits: Neural network quantization with adaptive bit-widths

Q Jin, L Yang, Z Liao - … of the IEEE/CVF Conference on …, 2020 - openaccess.thecvf.com

Deep neural networks with adaptive configurations have gained increasing attention due to
the instant and flexible deployment of these models on platforms with different resource …

Cited by 151 · Related articles · All 7 versions

Energy-efficient neural network accelerator based on outlier-aware low-precision computation

E Park, D Kim, S Yoo - 2018 ACM/IEEE 45th Annual …, 2018 - ieeexplore.ieee.org

Owing to the presence of large values, which we call outliers, conventional methods of
quantization fail to achieve significantly low precision, e.g., four bits, for very deep neural …

Cited by 218 · Related articles · All 3 versions
