Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

Deep learning in mobile and wireless networking: A survey

C Zhang, P Patras, H Haddadi - IEEE Communications Surveys …, 2019 - ieeexplore.ieee.org
The rapid uptake of mobile devices and the rising popularity of mobile applications and
services pose unprecedented demands on mobile and wireless networking infrastructure …

What does ChatGPT say: The DAO from algorithmic intelligence to linguistic intelligence

FY Wang, Q Miao, X Li, X Wang… - IEEE/CAA Journal of …, 2023 - ieeexplore.ieee.org
The well-known ancient Chinese philosopher Lao Tzu (老子) or Laozi (6th–4th century BC,
during the Spring and Autumn period) started his classic Tao Teh Ching (《道德经》) or Dao De …

On-device training under 256KB memory

J Lin, L Zhu, WM Chen, WC Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc
On-device training enables the model to adapt to new data collected from the sensors by
fine-tuning a pre-trained model. Users can benefit from customized AI models without having …
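
The snippet stops before describing the mechanism; as a rough, hypothetical illustration of on-device adaptation (last-layer fine-tuning in PyTorch, not the paper's 256KB-specific method; the model, data shapes, and hyperparameters below are placeholders):

```python
import torch
import torch.nn as nn

# Stand-in for a small pre-trained model; a real deployment would load
# trained weights rather than random initialization.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))

# Freeze the backbone and update only the final layer, shrinking the
# activation and optimizer-state memory that training must hold.
for p in model[:-1].parameters():
    p.requires_grad = False
opt = torch.optim.SGD(model[-1].parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 32)           # placeholder batch of new sensor readings
y = torch.randint(0, 2, (8,))    # placeholder labels

opt.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()                  # gradients flow only to the last layer
opt.step()
```

Freezing all but the head is only one memory-reduction lever; fitting full training under 256KB additionally requires the kind of algorithm-system co-design the paper targets, which this sketch does not attempt.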

Mastering Atari games with limited data

W Ye, S Liu, T Kurutach, P Abbeel… - Advances in Neural …, 2021 - proceedings.neurips.cc
Reinforcement learning has achieved great success in many applications. However, sample
efficiency remains a key challenge, with prominent methods requiring millions (or even …

ProxSkip: Yes! Local gradient steps provably lead to communication acceleration! Finally!

K Mishchenko, G Malinovsky, S Stich… - International …, 2022 - proceedings.mlr.press
We introduce ProxSkip, a surprisingly simple and provably efficient method for minimizing
the sum of a smooth ($f$) and an expensive nonsmooth proximable ($\psi$) function. The …
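
For context, the composite objective and the classical proximal gradient step that methods in this family build on (standard textbook forms, not copied from the paper) are:

```latex
\min_{x \in \mathbb{R}^d} \, f(x) + \psi(x),
\qquad
x_{t+1} = \operatorname{prox}_{\gamma\psi}\!\big(x_t - \gamma \nabla f(x_t)\big),
\quad \text{where} \quad
\operatorname{prox}_{\gamma\psi}(y) := \arg\min_{x} \Big\{ \psi(x) + \tfrac{1}{2\gamma}\,\|x - y\|^2 \Big\}.
```

As the title hints, the method's point is to evaluate the expensive prox only on a fraction of iterations; in the federated-learning instantiation, where $\psi$ encodes consensus among clients, each skipped prox step is a skipped communication round.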

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

Splitwise: Efficient generative LLM inference using phase splitting

P Patel, E Choukse, C Zhang, A Shah… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
Generative large language model (LLM) applications are growing rapidly, leading to large-
scale deployments of expensive and power-hungry GPUs. Our characterization of LLM …

Optuna: A next-generation hyperparameter optimization framework

T Akiba, S Sano, T Yanase, T Ohta… - Proceedings of the 25th …, 2019 - dl.acm.org
The purpose of this study is to introduce new design-criteria for next-generation
hyperparameter optimization software. The criteria we propose include (1) define-by-run API …
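
The define-by-run criterion in (1) means the search space is declared dynamically while the objective executes, rather than fixed up front. A minimal, self-contained sketch against Optuna's public API (the toy objective and parameter ranges are illustrative):

```python
import optuna

def objective(trial):
    # Define-by-run: parameters are declared as the function runs, so the
    # search space can branch on earlier suggestions.
    use_penalty = trial.suggest_categorical("use_penalty", [True, False])
    x = trial.suggest_float("x", -10.0, 10.0)
    value = (x - 2.0) ** 2
    if use_penalty:
        # This parameter exists only in trials that take this branch.
        lam = trial.suggest_float("lam", 1e-4, 1.0, log=True)
        value += lam * x * x
    return value  # the study below minimizes this return value

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```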

FedRolex: Model-heterogeneous federated learning with rolling sub-model extraction

S Alam, L Liu, M Yan, M Zhang - Advances in Neural …, 2022 - proceedings.neurips.cc
Most cross-device federated learning (FL) studies focus on the model-homogeneous setting
where the global server model and local client models are identical. However, such …
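
As a hedged sketch of the "rolling sub-model extraction" in the title, applied to a single layer's weight matrix (the indexing scheme below is an illustration, not code from the paper): each client trains a capacity-matched window of hidden units, and the window rolls forward every round so all global parameters are eventually trained.

```python
import numpy as np

def rolling_window_indices(round_t, hidden_dim, capacity_ratio, stride=1):
    """Illustrative rolling window over hidden units (assumed scheme).

    A client with capacity_ratio 0.5 trains half the units per round; the
    window start advances by `stride` each round and wraps around, so every
    unit is eventually covered.
    """
    k = max(1, int(hidden_dim * capacity_ratio))
    start = (round_t * stride) % hidden_dim
    return np.arange(start, start + k) % hidden_dim

# Extract this round's sub-model rows from a placeholder global weight matrix.
global_W = np.random.randn(128, 64)
idx = rolling_window_indices(round_t=7, hidden_dim=128, capacity_ratio=0.5)
client_W = global_W[idx, :]   # the slice a low-capacity client trains locally
```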