Combining machine learning techniques and genetic algorithm for predicting run times of high performance computing jobs

S Ramachandran, ML Jayalal, M Vasudevan… - Applied Soft …, 2024 - Elsevier
This study proposes a novel approach combining Machine Learning (ML) techniques and
Genetic Algorithms (GA) for predicting High-Performance Computing (HPC) job run times …

AMPRO-HPCC: A machine-learning tool for predicting resources on slurm HPC clusters

M Tanash, D Andresen, W Hsu - ADVCOMP... the... International …, 2021 - ncbi.nlm.nih.gov
Determining resource allocations (memory and time) for submitted jobs in High Performance
Computing (HPC) systems is a challenging process even for computer scientists. HPC users …

LS-HTC: an HTC system for large-scale jobs

J Hu, X Che, B Kan, Y Shao - CCF Transactions on High Performance …, 2024 - Springer
High throughput computing (HTC) uses mass computing resources over long periods of time
to accomplish a batch of short fast jobs; it is widely employed by Simulation Computation …

Mastering HPC Runtime Prediction: From Observing Patterns to a Methodological Approach

K Menear, A Nag, J Perr-Sauer, M Lunacek… - … and Experience in …, 2023 - dl.acm.org
The continual expansion of high-performance computing (HPC) brings with it an increasing
need for efficiency. Heavy investment in energy, hardware, and software infrastructure to …

Dynamic Memory Provisioning on Disaggregated HPC Systems

F Zacarias, P Carpenter, V Petrucci - … of the SC'23 Workshops of The …, 2023 - dl.acm.org
Disaggregated memory is under investigation as a way to break the rigid boundaries
between node memory hierarchies in order to provide memory as a system-wide pooled …

Is Knowledge about Running Applications Helping Improve Runtime Prediction of HPC Jobs?

K Menear, D Duplyakin - Practice and Experience in Advanced Research …, 2023 - dl.acm.org
High-performance computing systems rely upon scheduling algorithms to achieve high
utilization. These schedulers rely upon user estimates of job resource requirements, such as …

[BOOK] Improving HPC system performance by predicting job resources for submitted jobs using machine learning techniques

M Tanash - 2021 - search.proquest.com
Overestimation of High Performance Computing (HPC) job resources allocation
typically happens because of the wide variety of HPC applications, environment …

Job scheduling for disaggregated memory in high performance computing systems

F Vieira Zacarias - 2023 - upcommons.upc.edu
In a typical HPC cluster system, a node is the elemental component unit of this
architecture. Memory and compute resources are tightly coupled in each node and the rigid …

Data Characterization and Anomaly Detection for HPC Datacenters Using Machine Learning

W Liang - 2023 - atlarge-research.com
In the domain of High-Performance Computing (HPC), anomaly detection emerges as a
pivotal challenge. This research delves deeply into the architecture of Lisa and presents a …

JREP-A Job Runtime Ensemble Predictor for Improving Scheduling Performance on High Performance Computing Systems

TH Le Hai, T Nguyen - researchgate.net
Efficient resource utilization in High Performance Computing (HPC) systems heavily relies
on accurate job runtime prediction. This paper introduces JREP (Job Runtime Ensemble …