A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

A survey of techniques for optimizing transformer inference

KT Chitty-Venkata, S Mittal, M Emani… - Journal of Systems …, 2023 - Elsevier
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …

FlexiViT: One model for all patch sizes

L Beyer, P Izmailov, A Kolesnikov… - Proceedings of the …, 2023 - openaccess.thecvf.com
Vision Transformers convert images to sequences by slicing them into patches. The size of
these patches controls a speed/accuracy tradeoff, with smaller patches leading to higher …

LoRAPrune: Pruning meets low-rank parameter-efficient fine-tuning

M Zhang, H Chen, C Shen, Z Yang, L Ou, X Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large pre-trained models (LPMs), such as LLaMA and GLM, have shown exceptional
performance across various tasks through fine-tuning. Although low-rank adaption (LoRA) …

X-Pruner: explainable pruning for vision transformers

L Yu, W Xiang - Proceedings of the IEEE/CVF conference …, 2023 - openaccess.thecvf.com
Recently vision transformer models have become prominent models for a range of tasks.
These models, however, usually suffer from intensive computational costs and heavy …

Efficient railway track region segmentation algorithm based on lightweight neural network and cross-fusion decoder

Z Chen, J Yang, L Chen, Z Feng, L Jia - Automation in Construction, 2023 - Elsevier
To segment railway track regions in real-time for intrusion detection and improving security,
this paper proposes an efficient railway track region segmentation network (ERTNet) based …

Improving dynamic HDR imaging with fusion transformer

R Chen, B Zheng, H Zhang, Q Chen, C Yan… - Proceedings of the …, 2023 - ojs.aaai.org
Reconstructing a High Dynamic Range (HDR) image from several Low Dynamic
Range (LDR) images with different exposures is a challenging task, especially in the …

A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking

L Papa, P Russo, I Amerini… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Vision Transformer (ViT) architectures are becoming increasingly popular and widely
employed to tackle computer vision applications. Their main feature is the capacity to extract …

Deep compression of pre-trained transformer models

N Wang, CCC Liu, S Venkataramani… - Advances in …, 2022 - proceedings.neurips.cc
Pre-trained transformer models have achieved remarkable success in natural language
processing (NLP) and have recently become competitive alternatives to Convolution Neural …

Can Unstructured Pruning Reduce the Depth in Deep Neural Networks?

Z Liao, V Quétu, VT Nguyen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Pruning is a widely used technique for reducing the size of deep neural networks while
maintaining their performance. However, such a technique, despite being able to massively …