Containerization for high performance computing systems: Survey and prospects

N Zhou, H Zhou, D Hoppe - IEEE Transactions on Software …, 2022 - ieeexplore.ieee.org
Containers improve the efficiency in application deployment and thus have been widely
utilised on Cloud and lately in High Performance Computing (HPC) environments …

Container orchestration on HPC systems through Kubernetes

N Zhou, Y Georgiou, M Pospieszny, L Zhong… - Journal of Cloud …, 2021 - Springer
Containerisation demonstrates its efficiency in application deployment in Cloud Computing.
Containers can encapsulate complex programs with their dependencies in isolated …

Container orchestration on HPC systems

N Zhou, Y Georgiou, L Zhong, H Zhou… - 2020 IEEE 13th …, 2020 - ieeexplore.ieee.org
Containerisation demonstrates its efficiency in application deployment in cloud computing.
Containers can encapsulate complex programs with their dependencies in isolated …

Real-life experience with major reconfiguration of job scheduling system

D Klusáček, Š Tóth, G Podolníková - … for Parallel Processing: 19th and 20th …, 2017 - Springer
This work describes the goals and impacts of a large reconfiguration of the job scheduling
system, used in the Czech National Grid and Cloud infrastructure MetaCentrum, which was …

A hybrid scheduling platform: a runtime prediction reliability aware scheduling platform to improve HPC scheduling performance

M Naghshnejad, M Singhal - The Journal of Supercomputing, 2020 - Springer
The performance of scheduling algorithms for HPC jobs highly depends on the accuracy of
job runtime values. Prior research has established that neither user-provided runtimes nor …

Exploring plan-based scheduling for large-scale computing systems

X Zheng, Z Zhou, X Yang, Z Lan… - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
As HPC systems scale toward exascale, it becomes critical to manage the underlying
resource more effectively. While almost all existing resource management systems schedule …

Planning and metaheuristic optimization in production job scheduler

D Klusáček, V Chlumský - … Strategies for Parallel Processing: 19th and …, 2017 - Springer
In this work we present our positive experience with a unique advanced job scheduler which
we have developed for the widely used TORQUE Resource Manager. Unlike common …

ZeroSum: User Space Monitoring of Resource Utilization and Contention on Heterogeneous HPC Systems

K Huck, A Malony - Proceedings of the SC'23 Workshops of The …, 2023 - dl.acm.org
Heterogeneous High Performance Computing (HPC) systems are highly specialized,
complex, powerful, and expensive systems. Efficient utilization of these systems requires …

An Energy‐Efficient Task Scheduling Mechanism with Switching On/Sleep Mode of Servers in Virtualized Cloud Data Centers

C Yin, J Liu, S Jin - Mathematical Problems in Engineering, 2020 - Wiley Online Library
In recent years, the energy consumption of cloud data centers has continued to increase. A
large number of servers run at a low utilization rate, which results in a great waste of power …

Containerization and orchestration on HPC systems

N Zhou - Sustained Simulation Performance 2019 and 2020 …, 2021 - Springer
Containerization demonstrates its efficiency in application deployment in Cloud clusters.
HPC systems start to adopt containers, as containers can encapsulate complex programs …