Efficient Pruning of Large Language Model with Adaptive Estimation Fusion

J Liu, C Wu, C Yang, H Tang, Z Kong, G Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have become crucial for many generative downstream
tasks, leading to an inevitable trend and significant challenge to deploy them efficiently on …

Optimization-based Structural Pruning for Large Language Models without Back-Propagation

Y Gao, Z Liu, W Zhang, B Du, GS Xia - arXiv preprint arXiv:2406.10576, 2024 - arxiv.org
Compared to the moderate size of neural network models, structural weight pruning on the
Large-Language Models (LLMs) imposes a novel challenge on the efficiency of the pruning …

HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning

T Chen, X Qu, D Aponte, C Banbury, J Ko… - arXiv preprint arXiv …, 2024 - arxiv.org
Structured pruning is one of the most popular approaches to effectively compress the heavy
deep neural networks (DNNs) into compact sub-networks while retaining performance. The …

An Efficient Image Captioning Method Using Individual Pruning of the Transformer

권오설 - Journal of Broadcast Engineering (방송공학회논문지), 2024 - ksbe-jbe.org
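Abstract: This paper proposes an efficient transformer network for image captioning through an
individual pruning technique. In general, for image captioning models, a pre-trained CNN encoder, a transformer …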