Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of slowing
improvement in general-purpose processors due to the foreseeable end of Moore's Law …

Structured pruning for deep convolutional neural networks: A survey

Y He, L Xiao - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …

SPViT: Enabling faster vision transformers via latency-aware soft token pruning

Z Kong, P Dong, X Ma, X Meng, W Niu, M Sun… - European Conference on …, 2022 - Springer
Recently, Vision Transformer (ViT) has continuously established new milestones in
the computer vision field, while the high computation and memory cost makes its …
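
For context, a minimal sketch of hard top-k token pruning for a ViT, written in PyTorch. SPViT itself performs latency-aware soft pruning; the attention-based scores and keep ratio below are illustrative assumptions, not the paper's method:

```python
import torch

def prune_tokens(tokens: torch.Tensor, scores: torch.Tensor, keep_ratio: float = 0.5):
    """Keep only the highest-scoring tokens (hard pruning sketch).

    tokens: (batch, num_tokens, dim) patch embeddings
    scores: (batch, num_tokens) importance scores, e.g. mean attention received
    """
    k = max(1, int(tokens.shape[1] * keep_ratio))
    idx = scores.topk(k, dim=1).indices                       # (batch, k)
    idx = idx.unsqueeze(-1).expand(-1, -1, tokens.shape[-1])  # (batch, k, dim)
    return tokens.gather(1, idx)

# toy usage: 2 images, 16 tokens, 8-dim embeddings
x = torch.randn(2, 16, 8)
s = torch.rand(2, 16)
print(prune_tokens(x, s).shape)  # torch.Size([2, 8, 8])
```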

ChamNet: Towards efficient network design through platform-aware model adaptation

X Dai, P Zhang, B Wu, H Yin, F Sun… - Proceedings of the …, 2019 - openaccess.thecvf.com
This paper proposes an efficient neural network (NN) architecture design methodology
called Chameleon that honors given resource constraints. Instead of developing new …
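
As a toy illustration of what "honoring given resource constraints" can mean, the sketch below picks the most accurate candidate under a latency budget. Chameleon uses learned accuracy and latency predictors; the candidate table and budget here are made-up assumptions:

```python
# hypothetical candidates: (width_multiplier, predicted_accuracy, predicted_latency_ms)
candidates = [(0.5, 0.68, 8.2), (0.75, 0.72, 12.5), (1.0, 0.75, 18.9), (1.25, 0.77, 27.4)]

def best_under_budget(candidates, latency_budget_ms):
    """Return the highest-accuracy configuration that meets the platform budget."""
    feasible = [c for c in candidates if c[2] <= latency_budget_ms]
    return max(feasible, key=lambda c: c[1]) if feasible else None

print(best_under_budget(candidates, latency_budget_ms=15.0))  # (0.75, 0.72, 12.5)
```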

Interpreting CNNs via decision trees

Q Zhang, Y Yang, H Ma, YN Wu - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
This paper aims to quantitatively explain the rationale behind each prediction made by a
pre-trained convolutional neural network (CNN). We propose to learn a decision tree, which …
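
A related, simpler idea is a global surrogate: fit a shallow decision tree to mimic the network's own predictions and read decision rules off the tree. The sketch below uses synthetic stand-in activations and is not the paper's decision-tree construction:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# stand-in data: pretend `feats` are penultimate-layer CNN activations and
# `preds` are the CNN's predicted labels on the same inputs
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 16))
preds = (feats[:, 0] + 0.5 * feats[:, 3] > 0).astype(int)

# global surrogate: a shallow tree trained to imitate the network
tree = DecisionTreeClassifier(max_depth=3).fit(feats, preds)
print("fidelity to CNN predictions:", tree.score(feats, preds))
print(export_text(tree, feature_names=[f"act_{i}" for i in range(16)]))
```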

MEST: Accurate and fast memory-economic sparse training framework on the edge

G Yuan, X Ma, W Niu, Z Li, Z Kong… - Advances in …, 2021 - proceedings.neurips.cc
Recently, a new trend of exploring sparsity for accelerating neural network training has
emerged, embracing the paradigm of training on the edge. This paper proposes a novel …
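
For background, one step of a generic prune-and-regrow mask update in the spirit of dynamic sparse training (not MEST's memory-economic algorithm; the random regrow rule and 10% swap fraction are assumptions):

```python
import torch

def prune_and_regrow(weight: torch.Tensor, mask: torch.Tensor, swap_frac: float = 0.1):
    """Drop the weakest active weights, regrow as many at random inactive
    positions; overall sparsity stays constant."""
    flat_mask = mask.view(-1)
    active = flat_mask.bool()
    n_swap = max(1, int(swap_frac * int(active.sum())))
    inactive_idx = (~active).nonzero().flatten()  # snapshot before pruning
    # prune: smallest-magnitude weights among the currently active ones
    mags = weight.detach().abs().view(-1).masked_fill(~active, float("inf"))
    flat_mask[mags.topk(n_swap, largest=False).indices] = 0.0
    # regrow: random positions that were inactive before this step
    grow = inactive_idx[torch.randperm(inactive_idx.numel())[:n_swap]]
    flat_mask[grow] = 1.0
    return mask

w = torch.randn(64, 64)
m = (torch.rand(64, 64) < 0.2).float()  # start ~80% sparse
m = prune_and_regrow(w, m)
print(f"density after update: {m.mean().item():.3f}")  # still ~0.2
```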

CHEX: Channel exploration for CNN model compression

Z Hou, M Qin, F Sun, X Ma, K Yuan… - Proceedings of the …, 2022 - openaccess.thecvf.com
Channel pruning has been broadly recognized as an effective technique to reduce the
computation and memory cost of deep convolutional neural networks. However …
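
For reference, the simplest channel-pruning baseline is one-shot filter selection by L1 norm. CHEX instead explores and regrows channels during training; the keep ratio below is an arbitrary assumption:

```python
import torch
from torch import nn

def prune_conv(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the output channels whose filters have the largest L1 norms.
    Note: downstream layers must also have their input channels sliced."""
    k = max(1, int(conv.out_channels * keep_ratio))
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one score per filter
    keep = scores.topk(k).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, k, conv.kernel_size,
                       conv.stride, conv.padding, bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

conv = nn.Conv2d(16, 32, 3, padding=1)
print(prune_conv(conv).weight.shape)  # torch.Size([16, 16, 3, 3])
```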

AutoCompress: An automatic DNN structured pruning framework for ultra-high compression rates

N Liu, X Ma, Z Xu, Y Wang, J Tang, J Ye - Proceedings of the AAAI …, 2020 - ojs.aaai.org
Structured weight pruning is a representative model compression technique of DNNs to
reduce the storage and computation requirements and accelerate inference. An automatic …

Adversarial robustness vs. model compression, or both?

S Ye, K Xu, S Liu, H Cheng… - Proceedings of the …, 2019 - openaccess.thecvf.com
It is well known that deep neural networks (DNNs) are vulnerable to adversarial attacks,
which are implemented by adding crafted perturbations onto benign examples. Min-max …
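
As an example of such crafted perturbations, here is a minimal FGSM attack in PyTorch. The paper's actual contribution, a min-max framework unifying adversarial training with pruning, is not shown; the toy model and epsilon are assumptions:

```python
import torch

def fgsm_perturb(model, x, y, eps=0.03):
    """Fast Gradient Sign Method: one gradient-sign step that increases the loss."""
    x = x.clone().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # move each pixel in the loss-increasing direction, stay in valid range
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# toy usage with a stand-in classifier on fake 28x28 images
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x, y = torch.rand(4, 1, 28, 28), torch.randint(0, 10, (4,))
x_adv = fgsm_perturb(model, x, y)
print((x_adv - x).abs().max())  # bounded by eps
```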

Pruning networks with cross-layer ranking & k-reciprocal nearest filters

M Lin, L Cao, Y Zhang, L Shao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
This article focuses on filter-level network pruning. A novel pruning method, termed CLR-
RNF, is proposed. We first reveal a “long-tail” pruning problem in magnitude-based weight …
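
For intuition, the cross-layer (global) magnitude-ranking baseline that such methods build on can be sketched in a few lines; this is not CLR-RNF's k-reciprocal nearest-filter scheme, and the 90% sparsity target is an assumption:

```python
import torch
from torch import nn

def global_magnitude_masks(model: nn.Module, sparsity: float = 0.9):
    """Rank all weights jointly across layers; zero out the smallest fraction."""
    all_w = torch.cat([p.detach().abs().flatten()
                       for n, p in model.named_parameters() if "weight" in n])
    threshold = all_w.kthvalue(int(sparsity * all_w.numel())).values
    return {n: (p.detach().abs() > threshold).float()
            for n, p in model.named_parameters() if "weight" in n}

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
masks = global_magnitude_masks(model)
# per-layer kept fraction is uneven: the "long tail" of a single global threshold
print({n: round(m.mean().item(), 2) for n, m in masks.items()})
```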