Machine learning for microcontroller-class hardware: A review

SS Saha, SS Sandha, M Srivastava - IEEE Sensors Journal, 2022 - ieeexplore.ieee.org
The advancements in machine learning (ML) have opened a new opportunity to bring intelligence
to low-end Internet-of-Things (IoT) nodes, such as microcontrollers. Conventional ML …

MicroNets: Neural network architectures for deploying TinyML applications on commodity microcontrollers

C Banbury, C Zhou, I Fedorov… - … of machine learning …, 2021 - proceedings.mlsys.org
Executing machine learning workloads locally on resource-constrained microcontrollers
(MCUs) promises to drastically expand the application space of IoT. However, so-called …

Federated learning for resource-constrained IoT devices: Panoramas and state of the art

A Imteaj, K Mamun Ahmed, U Thakker, S Wang… - Federated and Transfer …, 2022 - Springer
Nowadays, devices are equipped with advanced sensors and greater processing and
computing capabilities. In addition, widespread Internet availability enables communication …
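
For readers skimming this entry, the core server-side step that federated learning surveys revisit is weighted model averaging. Below is a minimal NumPy sketch of FedAvg-style aggregation; it is an illustrative simplification, not code from the survey, and the client arrays and dataset sizes are made up.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Aggregate per-client model parameters with a dataset-size-weighted
    average (the FedAvg-style server step)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two hypothetical clients holding 10 and 30 local samples.
clients = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
sizes = [10, 30]
print(fedavg(clients, sizes))  # -> [2.5 3.5]
```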

S2TA: Exploiting structured sparsity for energy-efficient mobile CNN acceleration

ZG Liu, PN Whatmough, Y Zhu… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Exploiting sparsity is a key technique in accelerating quantized convolutional neural network
(CNN) inference on mobile devices. Prior sparse CNN accelerators largely exploit …
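
As a rough illustration of what "structured sparsity" buys: if zeros are constrained to a fixed pattern (for example, at most N nonzeros per group of M weights), an accelerator can skip work without irregular indexing. The NumPy sketch below prunes to a generic 2:4 pattern; note that S2TA's own density-bound block scheme differs in detail, so this is only a simplified, assumed stand-in.

```python
import numpy as np

def prune_n_of_m(W, n=2, m=4):
    """Keep the n largest-magnitude weights in every group of m along each
    row (generic N:M structured sparsity)."""
    rows, cols = W.shape
    assert cols % m == 0
    Wg = W.reshape(rows, cols // m, m)
    # Indices of the (m - n) smallest-magnitude entries in each group.
    drop = np.argsort(np.abs(Wg), axis=-1)[..., : m - n]
    mask = np.ones_like(Wg, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=-1)
    return (Wg * mask).reshape(rows, cols)

W = np.array([[1.0, -2.0, 3.0, -4.0],
              [5.0, 6.0, -7.0, -8.0]])
print(prune_n_of_m(W))  # -> [[ 0.  0.  3. -4.]  [ 0.  0. -7. -8.]]
```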

Abstracting Sparse DNN Acceleration via Structured Sparse Tensor Decomposition

G Jeong, PA Tsai, AR Bambhaniya, SW Keckler… - arXiv preprint arXiv …, 2024 - arxiv.org
Exploiting sparsity in deep neural networks (DNNs) has been a promising avenue for meeting the
growing computational needs of modern DNNs. However, in practice, sparse DNN acceleration …

Fast Kronecker Matrix-Matrix Multiplication on GPUs

A Jangda, M Yadav - Proceedings of the 29th ACM SIGPLAN Annual …, 2024 - dl.acm.org
Kronecker Matrix-Matrix Multiplication (Kron-Matmul) is the multiplication of a matrix with the
Kronecker Product of several smaller matrices. Kron-Matmul is a core operation for many …
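
To make the operation concrete: Y = X (A ⊗ B) never requires the Kronecker product to be materialized, because the fused column index of X can be split and contracted against A and B separately. The NumPy sketch below shows that identity; it illustrates the algorithmic idea only and is not the paper's GPU implementation.

```python
import numpy as np

def kron_matmul_naive(X, A, B):
    """Materialize A ⊗ B, then multiply (memory- and FLOP-heavy)."""
    return X @ np.kron(A, B)

def kron_matmul_factored(X, A, B):
    """Contract X against the factors directly, avoiding the full product."""
    m = X.shape[0]
    p, r = A.shape
    q, s = B.shape
    X3 = X.reshape(m, p, q)                     # split the fused column index
    Y3 = np.einsum('nik,ij,kl->njl', X3, A, B)  # contract against A and B
    return Y3.reshape(m, r * s)

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 6 * 5))
A = rng.standard_normal((6, 4))
B = rng.standard_normal((5, 3))
assert np.allclose(kron_matmul_naive(X, A, B), kron_matmul_factored(X, A, B))
```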

Sparse Iso-FLOP Transformations for Maximizing Training Efficiency

V Thangarasa, S Saxena, A Gupta… - Workshop on Advancing …, 2023 - openreview.net
Recent works have explored the use of weight sparsity to improve the training efficiency (test
accuracy with respect to training FLOPs) of deep neural networks (DNNs). These works aim to reduce …
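
The "iso-FLOP" idea can be illustrated with a one-line calculation: if a fraction s of a square layer's weights is pruned, widening both of its dimensions by 1/sqrt(1 - s) keeps per-step FLOPs equal to the dense baseline while raising capacity. The scaling rule below is an assumed reading of such a width transformation, not the paper's exact recipe.

```python
import math

def sparse_wide_factor(sparsity: float) -> float:
    """Width multiplier that holds FLOPs constant for a square layer when a
    fraction `sparsity` of its weights is pruned: FLOPs scale as k^2 * (1 - s)."""
    return 1.0 / math.sqrt(1.0 - sparsity)

for s in (0.5, 0.75, 0.9):
    k = sparse_wide_factor(s)
    print(f"sparsity={s:.2f}: widen {k:.2f}x, FLOP ratio = {k * k * (1 - s):.2f}")
```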

Hardware-Software Techniques for Accelerating Sparse Computation

M Soltaniyeh - 2022 - search.proquest.com
Linear algebra kernels are widely used in various fields such as machine learning, data
science, physical science, and graph analysis. Many of these applications work with sparse …
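
The sparse kernels referred to here typically operate on compressed formats such as CSR, where only nonzeros and their column indices are stored. A short, generic NumPy sketch of CSR sparse matrix-vector multiply (SpMV) follows; it is a textbook illustration, not the thesis's hardware-software design.

```python
import numpy as np

def csr_spmv(indptr, indices, data, x):
    """y = A @ x with A in CSR form: row i's nonzeros are
    data[indptr[i]:indptr[i+1]] at columns indices[indptr[i]:indptr[i+1]]."""
    y = np.zeros(len(indptr) - 1, dtype=x.dtype)
    for i in range(len(y)):
        lo, hi = indptr[i], indptr[i + 1]
        y[i] = np.dot(data[lo:hi], x[indices[lo:hi]])
    return y

# A = [[2, 0, 1],
#      [0, 0, 3]]
indptr  = np.array([0, 2, 3])
indices = np.array([0, 2, 2])
data    = np.array([2.0, 1.0, 3.0])
x       = np.ones(3)
print(csr_spmv(indptr, indices, data, x))  # -> [3. 3.]
```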

DENNI: Distributed Neural Network Inference on Severely Resource Constrained Edge Devices

R Sahu, R Toepfer, MD Sinclair… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Pervasive intelligence promises to revolutionize society from Industrial Internet of Things
(IIoT), to smart infrastructure and homes, to personal health monitoring. Unfortunately, many …

On-board processing with AI for more autonomous and capable satellite systems

T Lund - 2022 - diva-portal.org
While the use of Artificial Intelligence (AI) has seen a sharp rise in popularity in ground-
based industries, such as autonomous navigation in the automotive industry and …