XAI-driven knowledge distillation of large language models for efficient deployment on low-resource devices

R Cantini, A Orsino, D Talia - Journal of Big Data, 2024 - Springer
Large Language Models (LLMs) are characterized by their inherent memory
inefficiency and compute-intensive nature, making them impractical to run on low-resource …

Empowering Compact Language Models with Knowledge Distillation

AR Junaid - Authorea Preprints, 2025 - techrxiv.org
Large language models (LLMs) have revolutionized the field of artificial intelligence,
achieving unprecedented performance in tasks such as text generation, translation, and …

DDK: Distilling domain knowledge for efficient large language models

J Liu, C Zhang, J Guo, Y Zhang, H Que, K Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite the advanced intelligence abilities of large language models (LLMs) in various
applications, they still face significant computational and storage demands. Knowledge …

Survey on knowledge distillation for large language models: methods, evaluation, and application

C Yang, Y Zhu, W Lu, Y Wang, Q Chen, C Gao… - ACM Transactions on …, 2024 - dl.acm.org
Large Language Models (LLMs) have showcased exceptional capabilities in various
domains, attracting significant interest from both academia and industry. Despite their …

Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation

X Zhu, J Li, C Ma, W Wang - arXiv preprint arXiv:2411.14698, 2024 - arxiv.org
Large Language Models (LLMs) demonstrate exceptional reasoning capabilities, often
achieving state-of-the-art performance in various tasks. However, their substantial …

A survey on knowledge distillation of large language models

X Xu, M Li, C Tao, T Shen, R Cheng, J Li, C Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
This survey presents an in-depth exploration of knowledge distillation (KD) techniques
within the realm of Large Language Models (LLMs), spotlighting the pivotal role of KD in …
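Since several of these entries survey distillation from large (often API-only) teacher models, the snippet below is a minimal sketch of black-box, sequence-level distillation: a student is fine-tuned on text sampled from a teacher. The checkpoint names and prompt are placeholders, a shared tokenizer is assumed for simplicity, and nothing here is taken from any of the listed papers.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_name = "teacher-llm"   # placeholder: any causal LM checkpoint
student_name = "student-llm"   # placeholder: a smaller causal LM

tok = AutoTokenizer.from_pretrained(teacher_name)            # shared tokenizer assumed
teacher = AutoModelForCausalLM.from_pretrained(teacher_name).eval()
student = AutoModelForCausalLM.from_pretrained(student_name)

prompt = "Explain why the sky appears blue."
inputs = tok(prompt, return_tensors="pt")

# 1) Sample a completion from the teacher; no gradients are needed here.
with torch.no_grad():
    teacher_ids = teacher.generate(**inputs, max_new_tokens=128, do_sample=True)

# 2) Fine-tune the student on the teacher's output with the ordinary causal
#    language-modeling loss (labels are the input ids themselves). Real
#    pipelines would mask the prompt tokens and filter low-quality samples.
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)
out = student(input_ids=teacher_ids, labels=teacher_ids)
out.loss.backward()
optimizer.step()
```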

WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge

J Chen, T Wu, W Ji, F Wu - Frontiers of Digital Education, 2024 - Springer
Large language models (LLMs) have emerged as powerful tools in natural language
processing (NLP), showing a promising future of artificial general intelligence (AGI) …

Sparse Mixture of Experts Language Models Excel in Knowledge Distillation

H Xu, H Liu, W Gong, X Deng, H Wang - CCF International Conference on …, 2024 - Springer
Knowledge distillation is an effective method for reducing the computational
overhead of large language models. However, recent optimization efforts in distilling large …

Mixed distillation helps smaller language model better reasoning

C Li, Q Chen, L Li, C Wang, Y Li, Z Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
While large language models (LLMs) have demonstrated exceptional performance in recent
natural language processing (NLP) tasks, their deployment poses substantial challenges …

GKD: A general knowledge distillation framework for large-scale pre-trained language model

S Tan, WL Tam, Y Wang, W Gong, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Currently, the reduction in the parameter scale of large-scale pre-trained language models
(PLMs) through knowledge distillation has greatly facilitated their widespread deployment …
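As background for entries like this one, the following is a minimal sketch of the classic soft-target (white-box) distillation objective that such frameworks build on: a temperature-softened KL term between teacher and student logits combined with the standard cross-entropy loss. The alpha and temperature values are illustrative and not drawn from any of the listed papers; for language models the same loss is applied per output token.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Hinton-style soft-target distillation combined with hard-label cross-entropy."""
    # Temperature-softened distributions: teacher as probabilities, student as log-probs.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL(teacher || student); the T^2 factor keeps its gradient scale
    # comparable to the hard-label term as the temperature varies.
    kd_term = F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature ** 2

    # Ordinary cross-entropy against the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term
```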