RFVP: Rollback-free value prediction with safe-to-approximate loads

A Yazdanbakhsh, G Pekhimenko, B Thwaites… - ACM Transactions on …, 2016 - dl.acm.org
This article aims to tackle two fundamental memory bottlenecks: limited off-chip bandwidth
(bandwidth wall) and long access latency (memory wall). To achieve this goal, our approach …

TREE: Tree Regularization for Efficient Execution

L Schmid, D Biebert, C Hakert, KH Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
The rise of machine learning methods on heavily resource-constrained devices requires not
only the choice of a suitable model architecture for the target platform, but also the …

DrPy: Pinpointing Inefficient Memory Usage in Multi-Layer Python Applications

J Cui, Q Zhao, Y Hao, X Liu - 2024 IEEE/ACM International …, 2024 - ieeexplore.ieee.org
Python has become an increasingly popular programming language, especially in the areas
of data analytics and machine learning. Many modern Python packages employ a multi …

MicroNets: Neural network architectures for deploying TinyML applications on commodity microcontrollers

C Banbury, C Zhou, I Fedorov… - … of machine learning …, 2021 - proceedings.mlsys.org
Executing machine learning workloads locally on resource-constrained microcontrollers
(MCUs) promises to drastically expand the application space of IoT. However, so-called …

Learning input-aware performance models of configurable systems: An empirical evaluation

L Lesoil, H Spieker, A Gotlieb, M Acher… - Journal of Systems and …, 2024 - Elsevier
Modern software-based systems are highly configurable and come with a number of
configuration options that impact the performance of the systems. However, selecting …

Worst-case energy-consumption analysis by microarchitecture-aware timing analysis for device-driven cyber-physical systems

P Raffeck, C Eichler, P Wägemann… - … Workshop on Worst …, 2019 - drops.dagstuhl.de
Many energy-constrained cyber-physical systems require both timeliness and the execution
of tasks within given energy budgets. That is, besides knowledge on worst-case execution …

Phronesis: Efficient performance modeling for high-dimensional configuration tuning

Y Li, BC Lee - ACM Transactions on Architecture and Code …, 2022 - dl.acm.org
We present Phronesis, a learning framework for efficiently modeling the performance of data
analytic workloads as a function of their high-dimensional software configuration …

When climate meets machine learning: Edge to cloud ML energy efficiency

D Marculescu - 2021 IEEE/ACM International Symposium on …, 2021 - ieeexplore.ieee.org
A large portion of current cloud and edge workloads feature Machine Learning (ML) tasks,
thereby requiring a deep understanding of their energy efficiency. While the holy grail for …

Optimal policy for deployment of machine learning models on energy-bounded systems

SI Mirzadeh, H Ghasemzadeh - … of the Twenty-Ninth International Joint …, 2020 - par.nsf.gov
With the recent advances in both machine learning and embedded systems research, the
demand to deploy computational models for real-time execution on edge devices has …

Implementing optimization-based control tasks in cyber-physical systems with limited computing capacity

M Hosseinzadeh, B Sinopoli… - … Design for Cyber …, 2022 - ieeexplore.ieee.org
A common aspect of today's cyber-physical systems is that multiple optimization-based
control tasks may execute in a shared processor. Such control tasks make use of online …