Prediction of job characteristics for intelligent resource allocation in HPC systems: a survey and future directions

Z Hou, H Shen, X Zhou, J Gu, Y Wang… - Frontiers of Computer …, 2022 - Springer
Nowadays, high-performance computing (HPC) clusters are increasingly popular. Large
volumes of job logs recording many years of operation traces have been accumulated. In the …

Search-based methods for multi-cloud configuration

M Lazuka, T Parnell, A Anghel… - 2022 IEEE 15th …, 2022 - ieeexplore.ieee.org
Multi-cloud computing has become increasingly popular with enterprises looking to avoid
vendor lock-in. While most cloud providers offer similar functionality, they may differ …

Selecting efficient cloud resources for HPC workloads

JR Brunetta, E Borin - Proceedings of the 12th IEEE/ACM International …, 2019 - dl.acm.org
Constant advances in CPU, storage, and network virtualization are enabling high-
performance computing (HPC) applications to be efficiently executed on cloud computing …

Performance analysis of parallel composite service-based applications in clouds

X Li, L Pan, W Song, S Liu, X Meng - Future Generation Computer Systems, 2024 - Elsevier
When processing composite service application jobs containing parallel tasks, service
providers can optimize their quality of services (QoS) based on refined parallelism settings …

Towards a novel framework for automatic big data detection

H Ahmed, MA Ismail - IEEE Access, 2020 - ieeexplore.ieee.org
Big data is a "relative" concept. It is the combination of data, application, and platform
properties. Recently, big data specific technologies have emerged, including software …

On the promise and challenges of foundation models for learning-based cloud systems management

H Qiu, W Mao, CWH Franke, ZT Kalbarczyk… - … on Machine Learning …, 2023 - haoran-qiu.com
Foundation models (FMs) are machine learning models that are trained broadly on large-
scale data and can be adapted to a set of downstream tasks via fine-tuning, few-shot …

Performance prediction from simulation systems to physical systems using machine learning with transfer learning and scaling

A Mankodi, A Bhatt, B Chaudhury - … and Computation: Practice …, 2023 - Wiley Online Library
Selection from several computer systems with different hardware features resulting in
different software performance is a critical problem to solve. The problem becomes even …

Self-tuning serverless task farming using proactive elasticity control

S Kehrer, D Zietlow, J Scheffold, W Blochinger - Cluster Computing, 2021 - Springer
The cloud evolved into an attractive execution environment for parallel applications, which
make use of compute resources to speed up the computation of large problems in science …

Evaluation of neural network models for performance prediction of scientific applications

A Mankodi, A Bhatt, B Chaudhury - 2020 IEEE REGION 10 …, 2020 - ieeexplore.ieee.org
Performance prediction is an important and active research area. In particular, several
research efforts have built empirical models using machine learning algorithms for …

Containerised Application Profiling and Classification Using Benchmarks

A Psychas, P Dadamis, N Kapsoulis, A Litke… - Applied Sciences, 2022 - mdpi.com
Along with the rise of cloud and edge computing has come a plethora of solutions that
regard the deployment and operation of different types of applications in such environments …