Prodigy: Towards unsupervised anomaly detection in production HPC systems

B Aksar, E Sencan, B Schwaller, O Aaziz… - Proceedings of the …, 2023 - dl.acm.org
Performance variations caused by anomalies in modern High Performance Computing
(HPC) systems lead to decreased efficiency, impaired application performance, and …

Runtime Performance Anomaly Diagnosis in Production HPC Systems Using Active Learning

B Aksar, E Sencan, B Schwaller, O Aaziz… - … on Parallel and …, 2024 - ieeexplore.ieee.org
With the increasing scale and complexity of High-Performance Computing (HPC) systems,
performance variations in applications caused by anomalies have become significant …

Towards Practical Machine Learning Frameworks for Performance Diagnostics in Supercomputers

B Aksar, E Sencan, B Schwaller, VJ Leung… - Proceedings of the First …, 2023 - dl.acm.org
Supercomputers are highly sophisticated computing systems designed to handle complex
and computationally intensive tasks. Despite their tremendous efficiency, performance …

Machine learning-based performance analytics for high-performance computing systems

B Aksar - 2024 - search.proquest.com
High-performance Computing (HPC) systems play pivotal roles in societal and scientific
advancements, executing up to quintillions (10^18) of calculations every second. As we shift …

Online Performance Observation for HPC Applications

D Yokelson - 2024 - cs.uoregon.edu
This chapter contains unpublished and published material with and without co-authorship.
Sections 2.1, 2.2, 2.3, 2.4, 2.5 contain material from the departmental requirement, the Area …

Machine Learning-based Performance Analytics in Computer Systems

E Sencan, B Aksar, YC Lee, R Chen, AK Coskun - bu.edu
Efe Sencan, Burak Aksar, Yin-Ching …