Predictive performance modeling for distributed batch processing using black box monitoring and machine learning

C Witt, M Bux, W Gusew, U Leser - Information Systems, 2019 - Elsevier
In many domains, the previous decade was characterized by increasing data volumes and
growing complexity of data analyses, creating new demands for batch processing on …

Learning-based phase-aware multi-core CPU workload forecasting

ESA Lozano, A Gerstlauer - ACM transactions on design automation of …, 2022 - dl.acm.org
Predicting workload behavior during workload execution is essential for dynamic resource
optimization in multi-processor systems. Recent studies have proposed advanced machine …

A survey of phase classification techniques for characterizing variable application behavior

K Criswell, T Adegbija - IEEE Transactions on Parallel and …, 2019 - ieeexplore.ieee.org
Adaptable computing is an increasingly important paradigm that specializes system
resources to variable application requirements, environmental conditions, or user …

Run-time program-specific phase prediction for Python programs

MC Chiu, E Moss - Proceedings of the 15th International Conference on …, 2018 - dl.acm.org
It is well-known that a program's execution can be partitioned into different phases. Because
of their impact on micro-architectural components such as caches and branch predictors …

Learning-based workload phase classification and prediction using performance monitoring counters

ES Alcorta, A Gerstlauer - 2021 ACM/IEEE 3rd Workshop on …, 2021 - ieeexplore.ieee.org
Predicting coarse-grain variations in workload behavior during execution is essential for
dynamic resource optimization of processor systems. Researchers have proposed various …

Prophet: A parallel instruction-oriented many-core simulator

W Zhang, X Ji, Y Lu, H Wang, H Chen… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Most existing computer architecture simulators are cycle oriented, i.e., they are driven cycle
by cycle. However, frequent switches among simulation contexts, excessive buffer accesses …

ML for System-Level Modeling

ES Alcorta, P Brisk, A Gerstlauer - Machine Learning Applications in …, 2022 - Springer
Ever-increasing complexity and heterogeneity of systems and applications pose
fundamental new challenges for design, programming, and runtime management of …

SCFM: A Statistical Coarse-to-Fine Method to Select Cross-Microarchitecture Reliable Simulation Points

C Han, H Tan, T Zhang, X Li, R Wu, F Zhang - International Symposium on …, 2023 - Springer
With computer microarchitectures advancing and benchmark sizes expanding, the need for
agile pre-silicon performance estimation becomes increasingly crucial. SimPoint is a widely …

Energy-efficient thread mapping for heterogeneous many-core systems via dynamically adjusting the thread count

T Ju, Y Zhang, X Zhang, X Du, X Dong - Energies, 2019 - mdpi.com
Improving computing performance and reducing energy consumption are a major concern in
heterogeneous many-core systems. The thread count directly influences the computing …

Hardware Prefetching Tuning Method Based on Program Phase Behavior

L Huang, L Yan, T Wu - Journal of Circuits, Systems and …, 2024 - researchgate.net
Modern high-performance processor systems universally employ hardware prefetch engines
to address the “memory wall” issue. Nonetheless, prefetchers are typically activated with the …