Tiny machine learning: progress and futures [feature]

J Lin, L Zhu, WM Chen, WC Wang… - IEEE Circuits and …, 2023 - ieeexplore.ieee.org
Tiny machine learning (TinyML) is a new frontier of machine learning. By squeezing deep
learning models into billions of IoT devices and microcontrollers (MCUs), we expand the …
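
The snippet breaks off, but the core move in TinyML is shrinking a model's memory footprint until it fits MCU budgets. A minimal sketch of one staple technique, post-training int8 quantization with symmetric per-tensor scaling (illustrative code only, not this paper's specific pipeline; all names are made up for the sketch):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor post-training quantization to int8."""
    scale = np.abs(w).max() / 127.0                    # map max |w| onto int8 range
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)         # toy weight matrix
q, s = quantize_int8(w)
print("int8:", q.nbytes, "bytes vs fp32:", w.nbytes)   # 4x smaller storage
```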

Enabling resource-efficient AIoT system with cross-level optimization: A survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The emerging field of artificial intelligence of things (AIoT, AI+IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

Monarch Mixer: A simple sub-quadratic GEMM-based architecture

D Fu, S Arora, J Grogan, I Johnson… - Advances in …, 2024 - proceedings.neurips.cc
Machine learning models are increasingly being scaled in both sequence length
and model dimension to reach longer contexts and better performance. However, existing …
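
The "sub-quadratic GEMM-based" design in the title replaces a dense n×n matrix multiply with a product of block-diagonal (Monarch) factors, computable as a few small batched GEMMs in roughly O(n√n) instead of O(n²). A minimal PyTorch sketch, assuming n = m² and random factors (illustrative, not the authors' kernel):

```python
import torch

def monarch_matvec(x, L, R):
    """y = Monarch(L, R) @ x for n = m*m, via two batched block-diagonal GEMMs.

    x: (..., n); L, R: (m, m, m), i.e. m dense blocks of size m x m each.
    A dense matmul costs O(n^2); this costs O(n * sqrt(n))."""
    m = L.shape[0]
    x = x.reshape(*x.shape[:-1], m, m)            # view n as an (m, m) grid
    x = torch.einsum("bij,...bj->...bi", L, x)    # block-diagonal multiply
    x = x.transpose(-2, -1)                       # fixed permutation between factors
    x = torch.einsum("bij,...bj->...bi", R, x)    # second block-diagonal multiply
    return x.reshape(*x.shape[:-2], m * m)

m = 16                                            # n = 256
x = torch.randn(8, m * m)                         # batch of 8 vectors
L, R = torch.randn(m, m, m), torch.randn(m, m, m)
print(monarch_matvec(x, L, R).shape)              # torch.Size([8, 256])
```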

EdgeMoE: Fast on-device inference of MoE-based large language models

R Yi, L Guo, S Wei, A Zhou, S Wang, M Xu - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) such as GPTs and LLaMa have ushered in a revolution in
machine intelligence, owing to their exceptional capabilities in a wide range of machine …
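
Mixture-of-experts models activate only a few expert sub-networks per token, which is what makes on-device LLM inference plausible: cold experts need not occupy scarce memory. A generic top-k routing sketch (illustrative; the paper's engine for loading and quantizing experts on device is not modeled here):

```python
import torch, torch.nn.functional as F

def moe_forward(x, gate_w, experts, k=2):
    """Route each token to its top-k experts; only those experts run.

    x: (tokens, d); gate_w: (d, n_experts); experts: list of per-expert modules."""
    logits = x @ gate_w                                  # (tokens, n_experts)
    weights, idx = torch.topk(F.softmax(logits, -1), k)  # top-k gate scores
    y = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        mask = (idx == e)                                # tokens routed to expert e
        rows = mask.any(-1).nonzero(as_tuple=True)[0]
        if rows.numel() == 0:
            continue                                     # expert stays cold
        gate = (weights * mask)[rows].sum(-1, keepdim=True)
        y[rows] += gate * expert(x[rows])                # weighted expert output
    return y

d, n_exp = 32, 8
experts = [torch.nn.Linear(d, d) for _ in range(n_exp)]
x = torch.randn(16, d)
print(moe_forward(x, torch.randn(d, n_exp), experts).shape)  # (16, 32)
```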

Federated fine-tuning of billion-sized language models across mobile devices

M Xu, Y Wu, D Cai, X Li, S Wang - arXiv preprint arXiv:2308.13894, 2023 - arxiv.org
Large Language Models (LLMs) are transforming the landscape of mobile intelligence.
Federated Learning (FL), a method to preserve user data privacy, is often employed in fine …
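
Federated fine-tuning keeps raw data on the device: clients train locally on private data and the server only ever aggregates their weight updates. A minimal FedAvg round as a sketch (toy linear model with uniform averaging; not this paper's billion-parameter system):

```python
import copy, torch

def local_step(model, data, lr=1e-3):
    """One client's local fine-tuning on its private data (never uploaded)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for x, y in data:
        opt.zero_grad()
        torch.nn.functional.mse_loss(model(x), y).backward()
        opt.step()
    return model.state_dict()

def fedavg_round(global_model, client_datasets):
    """Server broadcasts weights, clients train locally, server averages."""
    states = [local_step(copy.deepcopy(global_model), d) for d in client_datasets]
    avg = {k: torch.stack([s[k] for s in states]).mean(0) for k in states[0]}
    global_model.load_state_dict(avg)
    return global_model

model = torch.nn.Linear(8, 1)
clients = [[(torch.randn(4, 8), torch.randn(4, 1))] for _ in range(3)]
model = fedavg_round(model, clients)
```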

Mandheling: Mixed-precision on-device DNN training with DSP offloading

D Xu, M Xu, Q Wang, S Wang, Y Ma, K Huang… - Proceedings of the 28th …, 2022 - dl.acm.org
This paper proposes Mandheling, the first system that enables highly resource-efficient on-
device training by orchestrating mixed-precision training with on-chip Digital Signal …
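
The idea behind mixed-precision on-device training is to run the expensive GEMMs in low precision (as a DSP would) while keeping fp32 master weights for stable updates. A generic sketch using fake int8 quantization with a straight-through estimator (illustrative only; the DSP offloading itself is not modeled):

```python
import torch

def fake_int8(t):
    """Simulate an int8 tensor: quantize-dequantize, with a straight-through
    estimator so gradients bypass the non-differentiable round()."""
    scale = t.abs().max().clamp(min=1e-8) / 127.0
    q = (t / scale).round().clamp(-127, 127) * scale
    return t + (q - t).detach()

def train_step(w_master, x, y, lr=1e-2):
    """Low-precision forward (what the DSP would run), fp32 weight update."""
    w = w_master.clone().requires_grad_(True)
    loss = torch.nn.functional.mse_loss(fake_int8(x) @ fake_int8(w), y)
    loss.backward()
    with torch.no_grad():
        w_master -= lr * w.grad                # update the fp32 master copy
    return loss.item()

w_master = torch.randn(8, 4)
x, y = torch.randn(16, 8), torch.randn(16, 4)
print(train_step(w_master, x, y))
```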

FwdLLM: Efficient federated finetuning of large language models with perturbed inferences

M Xu, D Cai, Y Wu, X Li, S Wang - … of the 2024 USENIX Conference on …, 2024 - dl.acm.org
Large Language Models (LLMs) are transforming the landscape of mobile intelligence.
Federated Learning (FL), a method to preserve user data privacy, is often employed in fine …
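
"Perturbed inferences" points at forward-mode gradient estimation: evaluate the loss under random weight perturbations and project the induced change back onto the perturbation directions, so no backpropagation memory is needed. A minimal sketch with a finite-difference estimator (an illustrative stand-in, not the paper's exact method):

```python
import torch

def forward_gradient(loss_fn, w, n_dirs=8, eps=1e-3):
    """Estimate grad(loss) at w using only forward passes:
    g ~= mean_v [ (loss(w + eps*v) - loss(w)) / eps ] * v,  v ~ N(0, I)."""
    base = loss_fn(w)
    g = torch.zeros_like(w)
    for _ in range(n_dirs):
        v = torch.randn_like(w)                    # random direction
        d = (loss_fn(w + eps * v) - base) / eps    # directional derivative
        g += d * v
    return g / n_dirs

# Toy check: a quadratic loss whose true gradient is 2w.
w = torch.randn(5)
g = forward_gradient(lambda w: (w ** 2).sum(), w, n_dirs=2000)
print(g, 2 * w)                                    # estimate approaches 2w
```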

TinyTrain: Deep neural network training at the extreme edge

YD Kwon, R Li, SI Venieris… - arXiv preprint arXiv …, 2023 - theyoungkwon.github.io
On-device training is essential for user personalisation and privacy. With the pervasiveness
of IoT devices and microcontroller units (MCUs), this task becomes more challenging due to …
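
A standard lever for training under MCU-class memory budgets, which this line of work refines with task-adaptive selection, is updating only a small subset of layers: frozen layers carry no gradient buffers or optimizer state. A generic sketch (the trainable layer is hard-coded here for illustration; the paper selects it adaptively):

```python
import torch

def freeze_except(model, trainable_names):
    """Train only the named layers; frozen layers need no optimizer state
    and no gradient buffers, shrinking on-device training memory."""
    for name, p in model.named_parameters():
        p.requires_grad = any(name.startswith(t) for t in trainable_names)
    return [p for p in model.parameters() if p.requires_grad]

model = torch.nn.Sequential(
    torch.nn.Linear(32, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 10),                 # update only this last layer
)
params = freeze_except(model, trainable_names=["4."])
opt = torch.optim.SGD(params, lr=1e-2)
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
opt.zero_grad()
torch.nn.functional.cross_entropy(model(x), y).backward()
opt.step()
```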

FlashFFTConv: Efficient convolutions for long sequences with tensor cores

DY Fu, H Kumbong, E Nguyen, C Ré - arXiv preprint arXiv:2311.05908, 2023 - arxiv.org
Convolution models with long filters have demonstrated state-of-the-art reasoning abilities in
many long-sequence tasks but lag behind the most optimized Transformers in wall-clock …
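
Long-filter convolution is pointwise multiplication in the frequency domain, so the O(N·L) sliding window becomes O(N log N). A minimal FFT-convolution sketch in PyTorch, i.e. the baseline computation the paper maps onto tensor cores (illustrative and unoptimized):

```python
import torch

def fft_conv(u, k):
    """Causal 1-D convolution of signal u (..., N) with filter k (N,) via FFT.
    Zero-pad to 2N to avoid circular wrap-around; cost O(N log N)."""
    n = u.shape[-1]
    U = torch.fft.rfft(u, n=2 * n)             # frequency-domain signal
    K = torch.fft.rfft(k, n=2 * n)             # frequency-domain filter
    return torch.fft.irfft(U * K, n=2 * n)[..., :n]

u = torch.randn(4, 1024)                       # batch of long sequences
k = torch.randn(1024)                          # filter as long as the input
y = fft_conv(u, k)

# Check against direct causal convolution on a small case:
u0, k0 = torch.randn(8), torch.randn(8)
direct = torch.stack([sum(u0[j] * k0[i - j] for j in range(i + 1)) for i in range(8)])
print(torch.allclose(fft_conv(u0, k0), direct, atol=1e-4))  # True
```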

Cost-effective on-device continual learning over memory hierarchy with Miro

X Ma, S Jeong, M Zhang, D Wang, J Choi… - Proceedings of the 29th …, 2023 - dl.acm.org
Continual learning (CL) trains NN models incrementally from a continuous stream of tasks.
To remember previously learned knowledge, prior studies store old samples over a memory …
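
"Store old samples" refers to rehearsal: a bounded episodic buffer whose contents are replayed alongside new-task data to curb forgetting. A minimal reservoir-sampling buffer of the kind such systems manage (illustrative; Miro's cost-effective placement and sizing of this buffer across the memory hierarchy are not modeled):

```python
import random

class ReservoirBuffer:
    """Fixed-size episodic memory: every sample seen so far has equal
    probability of residing in the buffer (reservoir sampling)."""
    def __init__(self, capacity):
        self.capacity, self.seen, self.data = capacity, 0, []

    def add(self, sample):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            j = random.randrange(self.seen)    # keep with prob capacity/seen
            if j < self.capacity:
                self.data[j] = sample

    def replay(self, k):
        """Mix k old samples into the current training batch."""
        return random.sample(self.data, min(k, len(self.data)))

buf = ReservoirBuffer(capacity=100)
for i in range(10_000):                        # stream of task data
    buf.add(("x", i))
print(len(buf.data), buf.replay(4))            # 100 stored, 4 replayed
```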