Influence of noisy environments on behavior of HPC applications

DA Nikitenko, F Wolf, B Mohr, T Hoefler… - Lobachevskii journal of …, 2021 - Springer
Many contemporary HPC systems expose their jobs to substantial amounts of interference,
leading to significant run-to-run variation. For example, application runtimes on Theta, a …

PerFlow: A domain specific framework for automatic performance analysis of parallel applications

Y Jin, H Wang, R Zhong, C Zhang, J Zhai - Proceedings of the 27th ACM …, 2022 - dl.acm.org
Performance analysis is widely used to identify performance issues of parallel applications.
However, complex communications and data dependence, as well as the interactions …

Scalable fine-grained call path tracing

NR Tallent, J Mellor-Crummey, M Franco… - Proceedings of the …, 2011 - dl.acm.org
Applications must scale well to make efficient use of even medium-scale parallel systems.
Because scaling problems are often difficult to diagnose, there is a critical need for scalable …

Tools for GPU computing – debugging and performance analysis of heterogenous HPC applications

M Knobloch, B Mohr - Supercomputing Frontiers and Innovations, 2020 - superfri.org
General purpose GPUs are now ubiquitous in high-end supercomputing. All but one (the
Japanese Fugaku system, which is based on ARM processors) of the announced (pre-) …

Reducing the overhead of direct application instrumentation using prior static analysis

J Mußler, D Lorenz, F Wolf - European Conference on Parallel Processing, 2011 - Springer
Preparing performance measurements of HPC applications is usually a tradeoff between
accuracy and granularity of the measured data. When using direct instrumentation, that is …

Framework for a productive performance optimization

H Servat, G Llort, K Huck, J Giménez, J Labarta - Parallel Computing, 2013 - Elsevier
Modern supercomputers deliver large computational power, but it is difficult for an
application to exploit such power. One factor that limits the application performance is the …

On the usefulness of object tracking techniques in performance analysis

G Llort, H Servat, J González, J Giménez… - Proceedings of the …, 2013 - dl.acm.org
Understanding the behavior of a parallel application is crucial if we are to tune it to achieve
its maximum performance. Yet the behavior the application exhibits may change over time …

ScalAna: Automating scaling loss detection with graph analysis

Y Jin, H Wang, T Yu, X Tang, T Hoefler… - … Conference for High …, 2020 - ieeexplore.ieee.org
Scaling a parallel program to modern supercomputers is challenging due to inter-process
communication, Amdahl's law, and resource contention. Performance analysis tools for …

Design and implementation of a hybrid parallel performance measurement system

A Morris, AD Malony, S Shende… - 2010 39th International …, 2010 - ieeexplore.ieee.org
Modern parallel performance measurement systems collect performance information either
through probes inserted in the application code or via statistical sampling. Probe-based …

Production application performance data streaming for system monitoring

R Izadpanah, BA Allan, D Dechev… - ACM Transactions on …, 2019 - dl.acm.org
In this article, we present an approach to streaming collection of application performance
data. Practical application performance tuning and troubleshooting in production high …