SOMA: Observability, monitoring, and in situ analytics for exascale applications

D Yokelson, O Lappi, S Ramesh… - Concurrency and …, 2024 - Wiley Online Library
With the rise of exascale systems and large, data‐centric workflows, the need to observe
and analyze high performance computing (HPC) applications during their execution is …

HEPnOS: A specialized data service for high energy physics analysis

S Ali, S Calvez, P Carns, M Dorier… - 2023 IEEE …, 2023 - ieeexplore.ieee.org
In this paper, we present HEPnOS, a distributed data service for managing data produced by
high-energy physics (HEP) experiments. Using HEPnOS, HEP applications can use HPC …

[PDF] Observability, monitoring, and in situ analytics in exascale applications

D Yokelson, O Lappi, S Ramesh, M Vaisala, K Huck… - Cray User Group, 2023 - cug.org
With the rise of exascale systems and large, data-centric workflows, the need to observe and
analyze high performance computing (HPC) applications during their execution is becoming …

Enabling Performance Observability for Heterogeneous HPC Workflows with SOMA

D Yokelson, M Titov, S Ramesh, O Kilic… - Proceedings of the 53rd …, 2024 - dl.acm.org
Heterogeneous workflows represent a promising approach for overcoming traditional
application performance limitations and to accelerate scientific insight on high-performance …

Extending the Mochi Methodology to Enable Dynamic HPC Data Services

M Dorier, P Carns, R Ross, S Snyder… - 2024 IEEE …, 2024 - ieeexplore.ieee.org
High-performance computing (HPC) applications and workflows are increasingly making
use of custom data services to complement traditional parallel file systems with fast transient …

[PDF] Enabling Performance Observability for Heterogeneous HPC Workflows with SOMA

M Titov, D Yokelson, S Ramesh, O Kilic, M Turilli, S Jha… - 2024 - osti.gov
Heterogeneous workflows represent a promising approach for overcoming traditional
application performance limitations and to accelerate scientific insight on high-performance …

[PDF] Online Performance Observation for HPC Applications

D Yokelson - 2024 - cs.uoregon.edu
This chapter contains unpublished and published material with and without co-authorship.
Sections 2.1, 2.2, 2.3, 2.4, 2.5 contain material from the departmental requirement, the Area …

[BOOK] Performance Observability and Monitoring of High Performance Computing with Microservices

S Ramesh - 2022 - search.proquest.com
Traditionally, High Performance Computing (HPC) software has been built and
deployed as bulk-synchronous, parallel executables based on the message-passing …