Melon: Breaking the memory wall for resource-efficient on-device machine learning

Q Wang, M Xu, C Jin, X Dong, J Yuan, X Jin… - Proceedings of the 20th …, 2022 - dl.acm.org
On-device learning is a promising technique for emerging privacy-preserving machine
learning paradigms. However, through quantitative experiments, we find that commodity …

A survey of resource-efficient LLM and multimodal foundation models

M Xu, W Yin, D Cai, R Yi, D Xu, Q Wang, B Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large foundation models, including large language models (LLMs), vision transformers
(ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine …

Mandheling: Mixed-precision on-device DNN training with DSP offloading

D Xu, M Xu, Q Wang, S Wang, Y Ma, K Huang… - Proceedings of the 28th …, 2022 - dl.acm.org
This paper proposes Mandheling, the first system that enables highly resource-efficient
on-device training by orchestrating mixed-precision training with on-chip Digital Signal …

A comprehensive deep learning library benchmark and optimal library selection

Q Zhang, X Che, Y Chen, X Ma, M Xu… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Deploying deep learning (DL) on mobile devices has been a notable trend in recent years.
To support fast inference of on-device DL, DL libraries play a critical role as algorithms and …

Rethinking mobile AI ecosystem in the LLM era

J Yuan, C Yang, D Cai, S Wang, X Yuan… - arXiv preprint arXiv …, 2023 - arxiv.org
In today's landscape, smartphones have evolved into hubs for hosting a multitude of deep
learning models aimed at local execution. A key realization driving this work is the notable …

A probabilistic approach to blood glucose prediction in type 1 diabetes under meal uncertainties

S Langarica, M Rodriguez-Fernandez… - IEEE Journal of …, 2023 - ieeexplore.ieee.org
Currently, most reliable and commercialized artificial pancreas systems for type 1 diabetes
are hybrid closed-loop systems, which require the user to announce every meal and its size …

Federated neural architecture search

J Yuan, M Xu, Y Zhao, K Bian, G Huang, X Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
To preserve user privacy while enabling mobile intelligence, techniques have been
proposed to train deep neural networks on decentralized data. However, training over …

Decentralized Cooperative Caching and Offloading for Virtual Reality Task based on GAN-Powered Multi-Agent Reinforcement Learning

Y Yang, L Feng, Y Sun, Y Li, F Zhou… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
As a critical and prevalent service in future mobile networks, virtual reality (VR) is
latency-sensitive and power-hungry, bringing out the optimization problem of trade-off among power …

Characterizing and understanding end-to-end multi-modal neural networks on GPUs

X Hou, C Xu, J Liu, X Tang, L Sun, C Li… - IEEE Computer …, 2022 - ieeexplore.ieee.org
Multi-modal neural networks have become increasingly pervasive in many machine learning
application domains due to their superior accuracy by fusing various modalities. However …

Exploring the Impact of In-Browser Deep Learning Inference on Quality of User Experience and Performance

Q Wang, S Jiang, Z Chen, X Cao, Y Li, A Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep Learning (DL) is increasingly being integrated into Web applications through a
method known as "in-browser inference", where the DL processes occur directly within Web …