Machine learning for microcontroller-class hardware: A review

SS Saha, SS Sandha, M Srivastava - IEEE Sensors Journal, 2022 - ieeexplore.ieee.org
The advancements in machine learning (ML) opened a new opportunity to bring intelligence
to the low-end Internet-of-Things (IoT) nodes, such as microcontrollers. Conventional ML …

Enabling Resource-Efficient AIoT System With Cross-Level Optimization: A Survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The emerging field of artificial intelligence of things (AIoT, AI + IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

ElasticViT: Conflict-aware supernet training for deploying fast vision transformer on diverse mobile devices

C Tang, LL Zhang, H Jiang, J Xu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract Neural Architecture Search (NAS) has shown promising performance in the
automatic design of vision transformers (ViT) exceeding 1G FLOPs. However, designing …

Towards efficient vision transformer inference: A first study of transformers on mobile devices

X Wang, LL Zhang, Y Wang, M Yang - Proceedings of the 23rd annual …, 2022 - dl.acm.org
Convolution neural networks (CNNs) have long been dominating the model choice in on-
device intelligent mobile applications. Recently, we are witnessing the fast development of …

CoDL: efficient CPU-GPU co-execution for deep learning inference on mobile devices

F Jia, D Zhang, T Cao, S Jiang, Y Liu, J Ren, Y Zhang - MobiSys, 2022 - chrisplus.me
Concurrent inference execution on heterogeneous processors is critical to improve the
performance of increasingly heavy deep learning (DL) models. However, available …

Flash: Heterogeneity-aware federated learning at scale

C Yang, M Xu, Q Wang, Z Chen… - IEEE Transactions …, 2022 - ieeexplore.ieee.org
Federated learning (FL) becomes a promising machine learning paradigm. The impact of
heterogeneous hardware specifications and dynamic states on the FL process has not yet …

One proxy device is enough for hardware-aware neural architecture search

B Lu, J Yang, W Jiang, Y Shi, S Ren - … of the ACM on Measurement and …, 2021 - dl.acm.org
Convolutional neural networks (CNNs) are used in numerous real-world applications such
as vision-based autonomous driving and video content analysis. To run CNN inference on …

MAPLE-Edge: A runtime latency predictor for edge devices

S Nair, S Abbasi, A Wong… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Abstract Neural Architecture Search (NAS) has enabled automatic discovery of more
efficient neural network architectures, especially for mobile and embedded vision …

Latency-constrained DNN architecture learning for edge systems using zerorized batch normalization

S Huai, D Liu, H Kong, W Liu, R Subramaniam… - Future Generation …, 2023 - Elsevier
Deep learning applications have been widely adopted on edge devices, to mitigate the
privacy and latency issues of accessing cloud servers. Deciding the number of neurons …

NNLQP: A multi-platform neural network latency query and prediction system with an evolving database

L Liu, M Shen, R Gong, F Yu, H Yang - Proceedings of the 51st …, 2022 - dl.acm.org
Deep neural networks (DNNs) are widely used in various applications. Accurate
latency feedback is essential for model design and deployment. In this work, we attempt to …