ACCESS: Advancing innovation: NSF's advanced cyberinfrastructure coordination ecosystem: Services & support

TJ Boerner, S Deems, TR Furlani, SL Knuth… - Practice and Experience …, 2023 - dl.acm.org
As the National Science Foundation evolves its investments in cyberinfrastructure, it has
made a significant investment in the ACCESS (Advanced Cyberinfrastructure Coordination …

A review of supercomputer performance monitoring systems

KS Stefanov, S Pawar, A Ranjan… - Supercomputing …, 2021 - superfri.susu.ru
High Performance Computing is now one of the emerging fields in computer
science and its applications. Top HPC facilities, supercomputers, offer great opportunities in …

A Slurm simulator: Implementation and parametric analysis

NA Simakov, MD Innus, MD Jones, RL DeLeon… - … , and Simulation: 8th …, 2018 - Springer
Slurm is an open-source resource manager for HPC that provides high configurability for
inhomogeneous resources and job scheduling. Various Slurm parametric settings can …

LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses

T Röhl, J Eitzinger, G Hager… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
System monitoring is an established tool to measure the utilization and health of HPC
systems. Usually system monitoring infrastructures make no connection to job information …

A workload analysis of NSF's innovative HPC resources using XDMoD

NA Simakov, JP White, RL DeLeon, SM Gallo… - arXiv preprint arXiv …, 2018 - arxiv.org
Workload characterization is an integral part of performance analysis of high performance
computing (HPC) systems. An understanding of workload properties sheds light on resource …

A conceptual framework for HPC operational data analytics

A Netti, W Shin, M Ott, T Wilde… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
This paper provides a broad framework for understanding trends in Operational Data
Analytics (ODA) for High-Performance Computing (HPC) facilities. The goal of ODA is to …

A64FX performance: experience on Ookami

MAS Bari, B Chapman, A Curtis… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
We examine the performance of scientific and engineering kernels on the Fujitsu A64FX
processor, both out-of-the-box using various toolchains and with processor-specific …

A resource utilization analytics platform using Grafana and Telegraf for the Savio supercluster

N Chan - Proceedings of the Practice and Experience in …, 2019 - dl.acm.org
Understanding high performance computing cluster utilization patterns is key for decision
making and efficient resource allocation. Cluster utilization statistics are useful for both …

Ookami: Deployment and initial experiences

A Burford, A Calder, D Carlson, B Chapman… - … and Experience in …, 2021 - dl.acm.org
Ookami [3] is a computer technology testbed supported by the United States National
Science Foundation. It provides researchers with access to the A64FX processor developed …

The Sol Supercomputer at Arizona State University

DM Jennewein, J Lee, C Kurtz, W Dizon… - … and Experience in …, 2023 - dl.acm.org
The Sol supercomputer provides ASU researchers access to a state-of-the-art system with
an observed GPU-only HPL speed of 2.272 PetaFLOP/s. This short paper provides a …