Power-performance characterization of TinyML systems

Y Zhang, D Wijerathne, Z Li… - 2022 IEEE 40th …, 2022 - ieeexplore.ieee.org
TinyML systems are enabling machine learning (ML) inference at the edge. However, there
exists little quantitative analysis of such systems. This paper presents a systematic …

Phronesis: Efficient performance modeling for high-dimensional configuration tuning

Y Li, BC Lee - ACM Transactions on Architecture and Code …, 2022 - dl.acm.org
We present Phronesis, a learning framework for efficiently modeling the performance of data
analytic workloads as a function of their high-dimensional software configuration …

SpecPIM: Accelerating Speculative Inference on PIM-Enabled System via Architecture-Dataflow Co-Exploration

C Li, Z Zhou, S Zheng, J Zhang, Y Liang… - Proceedings of the 29th …, 2024 - dl.acm.org
Generative large language models' (LLMs) inference suffers from inefficiency because of the
token dependency brought by autoregressive decoding. Recently, speculative inference has …

When climate meets machine learning: Edge to cloud ML energy efficiency

D Marculescu - 2021 IEEE/ACM International Symposium on …, 2021 - ieeexplore.ieee.org
A large portion of current cloud and edge workloads feature Machine Learning (ML) tasks,
thereby requiring a deep understanding of their energy efficiency. While the holy grail for …

System performance optimization via design and configuration space exploration

C Tang - Proceedings of the 2017 11th Joint Meeting on …, 2017 - dl.acm.org
The runtime performance of a software system often depends on a large number of static
parameters, which usually interact in complex ways to carry out system functionality and …

TensorFlow: A system for large-scale machine learning

A Agarwal, P Barham, E Brevdo… - Proceedings of the …, 2016 - andreask.cs.illinois.edu
• Cost model based on input and output size and computation time for a device. • Can be
heuristic or based on measured time for previous placement decisions. • Device for a node is …

Optimal policy for deployment of machine learning models on energy-bounded systems

SI Mirzadeh, H Ghasemzadeh - … of the Twenty-Ninth International Joint …, 2020 - par.nsf.gov
With the recent advances in both machine learning and embedded systems research, the
demand to deploy computational models for real-time execution on edge devices has …

Sampling effect on performance prediction of configurable systems: A case study

J Alves Pereira, M Acher, H Martin… - Proceedings of the ACM …, 2020 - dl.acm.org
Numerous software systems are highly configurable and provide a myriad of configuration
options that users can tune to fit their functional and performance requirements (e.g. …

System-aware optimization for machine learning at scale

V Smith - 2017 - escholarship.org
New computing systems have emerged in response to the increasing size and complexity of
modern datasets. For best performance, machine learning methods must be designed to …

Adaptive random forests for energy-efficient inference on microcontrollers

F Daghero, A Burrello, C Xie, L Benini… - 2021 IFIP/IEEE 29th …, 2021 - ieeexplore.ieee.org
Random Forests (RFs) are widely used Machine Learning models in low-power embedded
devices, due to their hardware friendly operation and high accuracy on practically relevant …