Mobile edge intelligence for large language models: A contemporary survey

G Qu, Q Chen, W Wei, Z Lin, X Chen… - … Surveys & Tutorials, 2025 - ieeexplore.ieee.org
On-device large language models (LLMs), i.e., LLMs that run directly on edge devices, have
attracted considerable interest since they are more cost-effective, latency-efficient, and privacy …

MiniCPM-V: A GPT-4V level MLLM on your phone

Y Yao, T Yu, A Zhang, C Wang, J Cui, H Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent surge of Multimodal Large Language Models (MLLMs) has fundamentally
reshaped the landscape of AI research and industry, shedding light on a promising path …

On-device language models: A comprehensive review

J Xu, Z Li, W Chen, Q Wang, X Gao, Q Cai… - arXiv preprint arXiv …, 2024 - arxiv.org
The advent of large language models (LLMs) revolutionized natural language processing
applications, and running LLMs on edge devices has become increasingly attractive for …

MiniCPM: Unveiling the potential of small language models with scalable training strategies

S Hu, Y Tu, X Han, C He, G Cui, X Long… - arXiv preprint arXiv …, 2024 - arxiv.org
The burgeoning interest in developing Large Language Models (LLMs) with up to a trillion
parameters has been met with concerns regarding resource efficiency and practical …

MELTing point: Mobile evaluation of language transformers

S Laskaridis, K Katevas, L Minto… - Proceedings of the 30th …, 2024 - dl.acm.org
Transformers have recently revolutionized the machine learning (ML) landscape, gradually
making their way into everyday tasks and equipping our computers with "sparks of …

A survey on efficient inference for large language models

Z Zhou, X Ning, K Hong, T Fu, J Xu, S Li, Y Lou… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have attracted extensive attention due to their remarkable
performance across various tasks. However, the substantial computational and memory …

Q-GaLore: Quantized GaLore with INT4 projection and layer-adaptive low-rank gradients

Z Zhang, A Jaiswal, L Yin, S Liu, J Zhao, Y Tian… - arXiv preprint arXiv …, 2024 - arxiv.org
Training Large Language Models (LLMs) is memory-intensive due to the large number of
parameters and associated optimization states. GaLore, a recent method, reduces memory …

Large language model supply chain: A research agenda

S Wang, Y Zhao, X Hou, H Wang - ACM Transactions on Software …, 2024 - dl.acm.org
The rapid advancement of large language models (LLMs) has revolutionized artificial
intelligence, introducing unprecedented capabilities in natural language processing and …

Deeploy: Enabling Energy-Efficient Deployment of Small Language Models on Heterogeneous Microcontrollers

M Scherer, L Macan, VJB Jung, P Wiese… - … on Computer-Aided …, 2024 - ieeexplore.ieee.org
With the rise of embodied foundation models (EFMs), most notably small language models
(SLMs), adapting Transformers for edge applications has become a very active field of …

LLM for mobile: An initial roadmap

D Chen, Y Liu, M Zhou, Y Zhao, H Wang… - ACM Transactions on …, 2024 - dl.acm.org
When mobile meets LLMs, mobile app users deserve more intelligent usage
experiences. For this to happen, we argue that there is a strong need to apply LLMs for the …