Tiny machine learning: progress and futures [feature]

J Lin, L Zhu, WM Chen, WC Wang… - IEEE Circuits and …, 2023 - ieeexplore.ieee.org
Tiny machine learning (TinyML) is a new frontier of machine learning. By squeezing deep
learning models into billions of IoT devices and microcontrollers (MCUs), we expand the …
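
The snippet breaks off, but the core move in TinyML is shrinking a model's memory footprint until it fits MCU budgets. A minimal sketch of one staple technique, post-training int8 quantization with symmetric per-tensor scaling (illustrative code only, not this paper's specific pipeline; all names are made up for the sketch):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor post-training quantization to int8."""
    scale = np.abs(w).max() / 127.0                    # map max |w| onto int8 range
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)         # toy weight matrix
q, s = quantize_int8(w)
print("int8:", q.nbytes, "bytes vs fp32:", w.nbytes)   # 4x smaller storage
```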

Enabling resource-efficient AIoT system with cross-level optimization: A survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The emerging field of artificial intelligence of things (AIoT, AI+IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

Monarch Mixer: A simple sub-quadratic GEMM-based architecture

D Fu, S Arora, J Grogan, I Johnson… - Advances in …, 2024 - proceedings.neurips.cc
Machine learning models are increasingly being scaled in both sequence length
and model dimension to reach longer contexts and better performance. However, existing …
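
The "sub-quadratic GEMM-based" design in the title replaces a dense n×n matrix multiply with a product of block-diagonal (Monarch) factors, computable as a few small batched GEMMs in roughly O(n√n) instead of O(n²). A minimal PyTorch sketch, assuming n = m² and random factors (illustrative, not the authors' kernel):

```python
import torch

def monarch_matvec(x, L, R):
    """y = Monarch(L, R) @ x for n = m*m, via two batched block-diagonal GEMMs.

    x: (..., n); L, R: (m, m, m), i.e. m dense blocks of size m x m each.
    A dense matmul costs O(n^2); this costs O(n * sqrt(n))."""
    m = L.shape[0]
    x = x.reshape(*x.shape[:-1], m, m)            # view n as an (m, m) grid
    x = torch.einsum("bij,...bj->...bi", L, x)    # block-diagonal multiply
    x = x.transpose(-2, -1)                       # fixed permutation between factors
    x = torch.einsum("bij,...bj->...bi", R, x)    # second block-diagonal multiply
    return x.reshape(*x.shape[:-2], m * m)

m = 16                                            # n = 256
x = torch.randn(8, m * m)                         # batch of 8 vectors
L, R = torch.randn(m, m, m), torch.randn(m, m, m)
print(monarch_matvec(x, L, R).shape)              # torch.Size([8, 256])
```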

EdgeMoE: Fast on-device inference of MoE-based large language models

R Yi, L Guo, S Wei, A Zhou, S Wang, M Xu - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) such as GPTs and LLaMa have ushered in a revolution in
machine intelligence, owing to their exceptional capabilities in a wide range of machine …
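
Mixture-of-experts models activate only a few expert sub-networks per token, which is what makes on-device LLM inference plausible: cold experts need not occupy scarce memory. A generic top-k routing sketch (illustrative; the paper's engine for loading and quantizing experts on device is not modeled here):

```python
import torch, torch.nn.functional as F

def moe_forward(x, gate_w, experts, k=2):
    """Route each token to its top-k experts; only those experts run.

    x: (tokens, d); gate_w: (d, n_experts); experts: list of per-expert modules."""
    logits = x @ gate_w                                  # (tokens, n_experts)
    weights, idx = torch.topk(F.softmax(logits, -1), k)  # top-k gate scores
    y = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        mask = (idx == e)                                # tokens routed to expert e
        rows = mask.any(-1).nonzero(as_tuple=True)[0]
        if rows.numel() == 0:
            continue                                     # expert stays cold
        gate = (weights * mask)[rows].sum(-1, keepdim=True)
        y[rows] += gate * expert(x[rows])                # weighted expert output
    return y

d, n_exp = 32, 8
experts = [torch.nn.Linear(d, d) for _ in range(n_exp)]
x = torch.randn(16, d)
print(moe_forward(x, torch.randn(d, n_exp), experts).shape)  # (16, 32)
```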

Federated fine-tuning of billion-sized language models across mobile devices

M Xu, Y Wu, D Cai, X Li, S Wang - arXiv preprint arXiv:2308.13894, 2023 - arxiv.org
Large Language Models (LLMs) are transforming the landscape of mobile intelligence.
Federated Learning (FL), a method to preserve user data privacy, is often employed in fine …
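
Federated fine-tuning keeps raw data on the device: clients train locally on private data and the server only ever aggregates their weight updates. A minimal FedAvg round as a sketch (toy linear model with uniform averaging; not this paper's billion-parameter system):

```python
import copy, torch

def local_step(model, data, lr=1e-3):
    """One client's local fine-tuning on its private data (never uploaded)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for x, y in data:
        opt.zero_grad()
        torch.nn.functional.mse_loss(model(x), y).backward()
        opt.step()
    return model.state_dict()

def fedavg_round(global_model, client_datasets):
    """Server broadcasts weights, clients train locally, server averages."""
    states = [local_step(copy.deepcopy(global_model), d) for d in client_datasets]
    avg = {k: torch.stack([s[k] for s in states]).mean(0) for k in states[0]}
    global_model.load_state_dict(avg)
    return global_model

model = torch.nn.Linear(8, 1)
clients = [[(torch.randn(4, 8), torch.randn(4, 1))] for _ in range(3)]
model = fedavg_round(model, clients)
```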

Mandheling: Mixed-precision on-device DNN training with DSP offloading

D Xu, M Xu, Q Wang, S Wang, Y Ma, K Huang… - Proceedings of the 28th …, 2022 - dl.acm.org
This paper proposes Mandheling, the first system that enables highly resource-efficient on-
device training by orchestrating mixed-precision training with on-chip Digital Signal …
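
The idea behind mixed-precision on-device training is to run the expensive GEMMs in low precision (as a DSP would) while keeping fp32 master weights for stable updates. A generic sketch using fake int8 quantization with a straight-through estimator (illustrative only; the DSP offloading itself is not modeled):

```python
import torch

def fake_int8(t):
    """Simulate an int8 tensor: quantize-dequantize, with a straight-through
    estimator so gradients bypass the non-differentiable round()."""
    scale = t.abs().max().clamp(min=1e-8) / 127.0
    q = (t / scale).round().clamp(-127, 127) * scale
    return t + (q - t).detach()

def train_step(w_master, x, y, lr=1e-2):
    """Low-precision forward (what the DSP would run), fp32 weight update."""
    w = w_master.clone().requires_grad_(True)
    loss = torch.nn.functional.mse_loss(fake_int8(x) @ fake_int8(w), y)
    loss.backward()
    with torch.no_grad():
        w_master -= lr * w.grad                # update the fp32 master copy
    return loss.item()

w_master = torch.randn(8, 4)
x, y = torch.randn(16, 8), torch.randn(16, 4)
print(train_step(w_master, x, y))
```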

FwdLLM: Efficient federated finetuning of large language models with perturbed inferences

M Xu, D Cai, Y Wu, X Li, S Wang - … of the 2024 USENIX Conference on …, 2024 - dl.acm.org
Large Language Models (LLMs) are transforming the landscape of mobile intelligence.
Federated Learning (FL), a method to preserve user data privacy, is often employed in fine …
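
"Perturbed inferences" points at forward-mode gradient estimation: evaluate the loss under random weight perturbations and project the induced change back onto the perturbation directions, so no backpropagation memory is needed. A minimal sketch with a finite-difference estimator (an illustrative stand-in, not the paper's exact method):

```python
import torch

def forward_gradient(loss_fn, w, n_dirs=8, eps=1e-3):
    """Estimate grad(loss) at w using only forward passes:
    g ~= mean_v [ (loss(w + eps*v) - loss(w)) / eps ] * v,  v ~ N(0, I)."""
    base = loss_fn(w)
    g = torch.zeros_like(w)
    for _ in range(n_dirs):
        v = torch.randn_like(w)                    # random direction
        d = (loss_fn(w + eps * v) - base) / eps    # directional derivative
        g += d * v
    return g / n_dirs

# Toy check: a quadratic loss whose true gradient is 2w.
w = torch.randn(5)
g = forward_gradient(lambda w: (w ** 2).sum(), w, n_dirs=2000)
print(g, 2 * w)                                    # estimate approaches 2w
```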

TinyTrain: Deep neural network training at the extreme edge

YD Kwon, R Li, SI Venieris… - arXiv preprint arXiv …, 2023 - theyoungkwon.github.io
On-device training is essential for user personalisation and privacy. With the pervasiveness
of IoT devices and microcontroller units (MCUs), this task becomes more challenging due to …
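
A standard lever for training under MCU-class memory budgets, which this line of work refines with task-adaptive selection, is updating only a small subset of layers: frozen layers carry no gradient buffers or optimizer state. A generic sketch (the trainable layer is hard-coded here for illustration; the paper selects it adaptively):

```python
import torch

def freeze_except(model, trainable_names):
    """Train only the named layers; frozen layers need no optimizer state
    and no gradient buffers, shrinking on-device training memory."""
    for name, p in model.named_parameters():
        p.requires_grad = any(name.startswith(t) for t in trainable_names)
    return [p for p in model.parameters() if p.requires_grad]

model = torch.nn.Sequential(
    torch.nn.Linear(32, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 10),                 # update only this last layer
)
params = freeze_except(model, trainable_names=["4."])
opt = torch.optim.SGD(params, lr=1e-2)
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
opt.zero_grad()
torch.nn.functional.cross_entropy(model(x), y).backward()
opt.step()
```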

FlashFFTConv: Efficient convolutions for long sequences with tensor cores

DY Fu, H Kumbong, E Nguyen, C Ré - arXiv preprint arXiv:2311.05908, 2023 - arxiv.org
Convolution models with long filters have demonstrated state-of-the-art reasoning abilities in
many long-sequence tasks but lag behind the most optimized Transformers in wall-clock …
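
Long-filter convolution is pointwise multiplication in the frequency domain, so the O(N·L) sliding window becomes O(N log N). A minimal FFT-convolution sketch in PyTorch, i.e. the baseline computation the paper maps onto tensor cores (illustrative and unoptimized):

```python
import torch

def fft_conv(u, k):
    """Causal 1-D convolution of signal u (..., N) with filter k (N,) via FFT.
    Zero-pad to 2N to avoid circular wrap-around; cost O(N log N)."""
    n = u.shape[-1]
    U = torch.fft.rfft(u, n=2 * n)             # frequency-domain signal
    K = torch.fft.rfft(k, n=2 * n)             # frequency-domain filter
    return torch.fft.irfft(U * K, n=2 * n)[..., :n]

u = torch.randn(4, 1024)                       # batch of long sequences
k = torch.randn(1024)                          # filter as long as the input
y = fft_conv(u, k)

# Check against direct causal convolution on a small case:
u0, k0 = torch.randn(8), torch.randn(8)
direct = torch.stack([sum(u0[j] * k0[i - j] for j in range(i + 1)) for i in range(8)])
print(torch.allclose(fft_conv(u0, k0), direct, atol=1e-4))  # True
```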

Cost-effective on-device continual learning over memory hierarchy with Miro

X Ma, S Jeong, M Zhang, D Wang, J Choi… - Proceedings of the 29th …, 2023 - dl.acm.org
Continual learning (CL) trains NN models incrementally from a continuous stream of tasks.
To remember previously learned knowledge, prior studies store old samples over a memory …
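
"Store old samples" refers to rehearsal: a bounded episodic buffer whose contents are replayed alongside new-task data to curb forgetting. A minimal reservoir-sampling buffer of the kind such systems manage (illustrative; Miro's cost-effective placement and sizing of this buffer across the memory hierarchy are not modeled):

```python
import random

class ReservoirBuffer:
    """Fixed-size episodic memory: every sample seen so far has equal
    probability of residing in the buffer (reservoir sampling)."""
    def __init__(self, capacity):
        self.capacity, self.seen, self.data = capacity, 0, []

    def add(self, sample):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            j = random.randrange(self.seen)    # keep with prob capacity/seen
            if j < self.capacity:
                self.data[j] = sample

    def replay(self, k):
        """Mix k old samples into the current training batch."""
        return random.sample(self.data, min(k, len(self.data)))

buf = ReservoirBuffer(capacity=100)
for i in range(10_000):                        # stream of task data
    buf.add(("x", i))
print(len(buf.data), buf.replay(4))            # 100 stored, 4 replayed
```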