Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of slowing
improvement in general-purpose processors due to the foreseeable end of Moore's Law …

Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

CHIP: Channel independence-based pruning for compact neural networks

Y Sui, M Yin, Y Xie, H Phan… - Advances in Neural …, 2021 - proceedings.neurips.cc
Filter pruning has been widely used for neural network compression because it enables
practical acceleration. To date, most of the existing filter pruning works explore the …
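For context on what filter pruning does in practice, here is a minimal sketch using the simple L1-norm importance score common in this literature (an assumption for illustration; CHIP's own criterion is channel independence, which is more involved):

```python
import numpy as np

def prune_filters_l1(weight, keep_ratio=0.5):
    """Zero out the conv filters with the smallest L1 norms.

    weight: array of shape (out_channels, in_channels, kH, kW).
    Returns the pruned weight and the indices of kept filters.
    """
    scores = np.abs(weight).sum(axis=(1, 2, 3))      # one L1 score per filter
    n_keep = max(1, int(keep_ratio * weight.shape[0]))
    keep = np.argsort(scores)[-n_keep:]              # highest-scoring filters survive
    pruned = np.zeros_like(weight)
    pruned[keep] = weight[keep]                      # the rest stay zero
    return pruned, np.sort(keep)

# Toy usage: 8 filters, keep the 4 with the largest L1 norm.
w = np.random.randn(8, 3, 3, 3)
w_pruned, kept = prune_filters_l1(w, keep_ratio=0.5)
print(kept, np.count_nonzero(np.abs(w_pruned).sum(axis=(1, 2, 3))))
```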

SIGMA: A sparse and irregular GEMM accelerator with flexible interconnects for DNN training

E Qin, A Samajdar, H Kwon, V Nadella… - … Symposium on High …, 2020 - ieeexplore.ieee.org
The advent of Deep Learning (DL) has radically transformed the computing industry across
the entire spectrum from algorithms to circuits. As myriad application domains embrace DL, it …
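As a software reference for the operation such accelerators target, a minimal sparse-times-dense GEMM in CSR form (an illustrative sketch only; it says nothing about SIGMA's interconnect or dataflow):

```python
import numpy as np

def to_csr(a):
    """Convert a dense matrix to CSR (values, column indices, row pointers)."""
    vals, cols, ptrs = [], [], [0]
    for row in a:
        nz = np.nonzero(row)[0]
        vals.extend(row[nz]); cols.extend(nz)
        ptrs.append(len(vals))
    return np.array(vals), np.array(cols), np.array(ptrs)

def csr_gemm(vals, cols, ptrs, b):
    """Multiply CSR matrix A by dense B, skipping A's zeros entirely."""
    out = np.zeros((len(ptrs) - 1, b.shape[1]))
    for i in range(len(ptrs) - 1):
        for k in range(ptrs[i], ptrs[i + 1]):
            out[i] += vals[k] * b[cols[k]]   # only nonzero A[i, cols[k]] contributes
    return out

a = np.random.randn(4, 6) * (np.random.rand(4, 6) > 0.7)  # ~70% zeros
b = np.random.randn(6, 5)
assert np.allclose(csr_gemm(*to_csr(a), b), a @ b)
```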

SparTen: A sparse tensor accelerator for convolutional neural networks

A Gondimalla, N Chesnut, M Thottethodi… - Proceedings of the …, 2019 - dl.acm.org
Convolutional neural networks (CNNs) are emerging as powerful tools for image
processing. Recent machine learning work has reduced CNNs' compute and data volumes …
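SparTen-style designs store sparse tensors compactly; below is a minimal sketch of one common scheme, a bitmask plus packed nonzero values (an assumption for illustration; SparTen's exact on-chip format and matching logic are more involved):

```python
import numpy as np

def bitmask_encode(x):
    """Compress a tensor into (bitmask, packed nonzero values)."""
    flat = x.ravel()
    mask = flat != 0                 # one bit per element
    return mask, flat[mask]

def bitmask_decode(mask, values, shape):
    """Reconstruct the dense tensor from the bitmask encoding."""
    flat = np.zeros(mask.size, dtype=values.dtype)
    flat[mask] = values
    return flat.reshape(shape)

x = np.round(np.random.randn(4, 4)) * (np.random.rand(4, 4) > 0.6)
mask, vals = bitmask_encode(x)
assert np.array_equal(bitmask_decode(mask, vals, x.shape), x)
print(f"dense elems: {x.size}, stored values: {vals.size}")
```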

Sanger: A co-design framework for enabling sparse attention using reconfigurable architecture

L Lu, Y Jin, H Bi, Z Luo, P Li, T Wang… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
In recent years, attention-based models have achieved impressive performance in natural
language processing and computer vision applications by effectively capturing contextual …
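A simplified software analogue of sparse attention follows (hedged: Sanger predicts its sparsity pattern from quantized scores and co-designs the hardware; this toy simply drops low-scoring entries per row before the softmax):

```python
import numpy as np

def sparse_attention(q, k, v, keep_ratio=0.25):
    """Attention that evaluates only the top-scoring fraction of each row."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Per-row threshold: each query keeps at least its best-matching keys.
    thresh = np.quantile(scores, 1.0 - keep_ratio, axis=-1, keepdims=True)
    masked = np.where(scores >= thresh, scores, -np.inf)   # drop small entries
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

q, k, v = (np.random.randn(8, 16) for _ in range(3))
out = sparse_attention(q, k, v)
print(out.shape)  # (8, 16)
```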

An efficient hardware accelerator for sparse convolutional neural networks on FPGAs

L Lu, J Xie, R Huang, J Zhang, W Lin… - 2019 IEEE 27th Annual …, 2019 - ieeexplore.ieee.org
Deep convolutional neural networks (CNNs) have achieved remarkable performance at the
cost of huge computation. As CNN models become more complex and deeper …

An efficient hardware accelerator for structured sparse convolutional neural networks on FPGAs

C Zhu, K Huang, S Yang, Z Zhu… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Deep convolutional neural networks (CNNs) have achieved state-of-the-art performance in a
wide range of applications. However, deeper CNN models, which are usually computation …
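The practical payoff of structured (channel-level) sparsity over unstructured zeros: pruned channels can be sliced away entirely, leaving a smaller dense tensor that ordinary dense hardware runs faster (an illustrative sketch; the paper's FPGA dataflow is not modeled here):

```python
import numpy as np

def compact_structured(weight, kept_out, kept_in):
    """Slice a conv weight down to its surviving channels.

    Unstructured pruning leaves zeros scattered inside a full-size tensor;
    structured pruning removes whole channels, so the tensor itself shrinks.
    """
    return weight[np.ix_(kept_out, kept_in)]

w = np.random.randn(8, 6, 3, 3)           # (out_ch, in_ch, kH, kW)
kept_out, kept_in = [0, 2, 5, 7], [1, 3, 4]
w_small = compact_structured(w, kept_out, kept_in)
print(w.shape, "->", w_small.shape)        # (8, 6, 3, 3) -> (4, 3, 3, 3)
```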

GoSPA: An energy-efficient high-performance globally optimized sparse convolutional neural network accelerator

C Deng, Y Sui, S Liao, X Qian… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
The co-existence of activation sparsity and model sparsity in convolutional neural network
(CNN) models makes sparsity-aware CNN hardware designs very attractive. The existing …
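What co-existing activation and weight sparsity buys, in software terms: a multiply happens only where both operands are nonzero, i.e., at the intersection of their index sets (a sketch of the general idea; GoSPA's global optimization is a hardware scheduling scheme, not shown):

```python
import numpy as np

def sparse_sparse_dot(act, wgt):
    """Dot product touching only positions where BOTH vectors are nonzero."""
    a_idx = set(np.nonzero(act)[0])
    w_idx = set(np.nonzero(wgt)[0])
    common = a_idx & w_idx                       # index intersection
    return sum(act[i] * wgt[i] for i in common), len(common)

act = np.array([0.0, 1.5, 0.0, 2.0, 0.0, 0.5])
wgt = np.array([3.0, 0.0, 0.0, 1.0, 2.0, 4.0])
val, mults = sparse_sparse_dot(act, wgt)
assert np.isclose(val, act @ wgt)
print(f"{mults} multiplies instead of {act.size}")  # 2 instead of 6
```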

HyPar: Towards hybrid parallelism for deep learning accelerator array

L Song, J Mao, Y Zhuo, X Qian, H Li… - 2019 IEEE international …, 2019 - ieeexplore.ieee.org
With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have
been widely used in many domains. To achieve high performance and energy efficiency …
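The design space HyPar searches over, in miniature: each layer's work can be split across devices by batch (data parallel) or by weight columns (model parallel), trading compute balance against communication (a hedged toy; HyPar's layer-wise optimization over an accelerator array is not reproduced):

```python
import numpy as np

def data_parallel(x, w, n_dev=2):
    """Each device holds all of W but only a slice of the batch."""
    parts = [xs @ w for xs in np.array_split(x, n_dev, axis=0)]
    return np.concatenate(parts, axis=0)

def model_parallel(x, w, n_dev=2):
    """Each device holds the whole batch but only a slice of W's columns."""
    parts = [x @ ws for ws in np.array_split(w, n_dev, axis=1)]
    return np.concatenate(parts, axis=1)

x = np.random.randn(16, 32)   # (batch, features)
w = np.random.randn(32, 64)   # layer weights
assert np.allclose(data_parallel(x, w), x @ w)
assert np.allclose(model_parallel(x, w), x @ w)
```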