Mobile edge intelligence for large language models: A contemporary survey

G Qu, Q Chen, W Wei, Z Lin, X Chen… - … Surveys & Tutorials, 2025 - ieeexplore.ieee.org
On-device large language models (LLMs), i.e., LLMs that run directly on edge devices, have
attracted considerable interest since they are more cost-effective, latency-efficient, and privacy …

MiniCPM-V: A GPT-4V level MLLM on your phone

Y Yao, T Yu, A Zhang, C Wang, J Cui, H Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent surge of Multimodal Large Language Models (MLLMs) has fundamentally
reshaped the landscape of AI research and industry, shedding light on a promising path …

On-device language models: A comprehensive review

J Xu, Z Li, W Chen, Q Wang, X Gao, Q Cai… - arXiv preprint arXiv …, 2024 - arxiv.org
The advent of large language models (LLMs) revolutionized natural language processing
applications, and running LLMs on edge devices has become increasingly attractive for …

MiniCPM: Unveiling the potential of small language models with scalable training strategies

S Hu, Y Tu, X Han, C He, G Cui, X Long… - arXiv preprint arXiv …, 2024 - arxiv.org
The burgeoning interest in developing Large Language Models (LLMs) with up to a trillion
parameters has been met with concerns regarding resource efficiency and practical …

MELTing point: Mobile evaluation of language transformers

S Laskaridis, K Katevas, L Minto… - Proceedings of the 30th …, 2024 - dl.acm.org
Transformers have recently revolutionized the machine learning (ML) landscape, gradually
making their way into everyday tasks and equipping our computers with "sparks of …

A survey on efficient inference for large language models

Z Zhou, X Ning, K Hong, T Fu, J Xu, S Li, Y Lou… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have attracted extensive attention due to their remarkable
performance across various tasks. However, the substantial computational and memory …

Q-GaLore: Quantized GaLore with INT4 projection and layer-adaptive low-rank gradients

Z Zhang, A Jaiswal, L Yin, S Liu, J Zhao, Y Tian… - arXiv preprint arXiv …, 2024 - arxiv.org
Training Large Language Models (LLMs) is memory-intensive due to the large number of
parameters and associated optimization states. GaLore, a recent method, reduces memory …

Large language model supply chain: A research agenda

S Wang, Y Zhao, X Hou, H Wang - ACM Transactions on Software …, 2024 - dl.acm.org
The rapid advancement of large language models (LLMs) has revolutionized artificial
intelligence, introducing unprecedented capabilities in natural language processing and …

Deeploy: Enabling Energy-Efficient Deployment of Small Language Models on Heterogeneous Microcontrollers

M Scherer, L Macan, VJB Jung, P Wiese… - … on Computer-Aided …, 2024 - ieeexplore.ieee.org
With the rise of embodied foundation models (EFMs), most notably small language models
(SLMs), adapting Transformers for edge applications has become a very active field of …

LLM for mobile: An initial roadmap

D Chen, Y Liu, M Zhou, Y Zhao, H Wang… - ACM Transactions on …, 2024 - dl.acm.org
When mobile meets LLMs, mobile app users deserve more intelligent usage
experiences. For this to happen, we argue that there is a strong need to apply LLMs for the …