Using XDMoD to facilitate XSEDE operations, planning and analysis

M Turilli, M Santcroos, S Jha - ACM Computing Surveys (CSUR), 2018 - dl.acm.org

Pilot-Job systems play an important role in supporting distributed scientific computing. They
are used to execute millions of jobs on several cyberinfrastructures worldwide, consuming …

Cited by 87 · Related articles · All 7 versions

[PDF] nsf.gov

Open XDMoD: A tool for the comprehensive management of high-performance computing resources

JT Palmer, SM Gallo, TR Furlani… - … in Science & …, 2015 - ieeexplore.ieee.org

Open XDMoD is an open source tool designed to facilitate the management of high-
performance computing (HPC) systems. The Open XDMoD portal provides a rich set of …

Cited by 110 · Related articles · All 6 versions

[PDF] superfri.org

Deep analysis of job state statistics on Lomonosov-2 supercomputer

DA Nikitenko, VV Voevodin, SA Zhumatiy - … Frontiers and Innovations, 2018 - superfri.org

It is common knowledge that the increasingly growing capabilities of HPC systems are
always limited by a number of efficiency related issues. The reasons can be very different …

Cited by 20 · Related articles · All 9 versions

[PDF] researchgate.net

First Impressions of the NVIDIA Grace CPU Superchip and NVIDIA Grace Hopper Superchip for Scientific Workloads

NA Simakov, MD Jones, TR Furlani… - Proceedings of the …, 2024 - dl.acm.org

The engineering samples of the NVIDIA Grace CPU Superchip and NVIDIA Grace Hopper
Superchips were tested using different benchmarks and scientific applications. The …

Cited by 4 · Related articles · All 7 versions

Understanding application and system performance through system-wide monitoring

RT Evans, JC Browne, WL Barth - 2016 IEEE International …, 2016 - ieeexplore.ieee.org

TACC Stats is a continuous monitoring tool for HPC systems that collects data at the core
and process level for every job executing on a monitored system. That data can be …

Cited by 20 · Related articles · All 2 versions

[PDF] acm.org

Are we ready for broader adoption of ARM in the HPC community: Performance and Energy Efficiency Analysis of Benchmarks and Applications Executed on High …

NA Simakov, RL Deleon, JP White, MD Jones… - Proceedings of the …, 2023 - dl.acm.org

A set of benchmarks, including numerical libraries and real-world scientific applications,
were run on several modern ARM systems (Amazon Graviton 3/2, Fujitsu A64FX, Ampere …

Cited by 5 · Related articles · All 4 versions

[PDF] nsf.gov

Analysis of XDMoD/SUPReMM data using machine learning techniques

SM Gallo, JP White, RL DeLeon… - 2015 IEEE …, 2015 - ieeexplore.ieee.org

Machine learning techniques were applied to job accounting and performance data for
application classification. Job data were accumulated using the XDMoD monitoring …

Cited by 18 · Related articles · All 5 versions

[PDF] nsf.gov

Comprehensive, open‐source resource usage measurement and analysis for HPC systems

JC Browne, RL DeLeon, AK Patra… - Concurrency and …, 2014 - Wiley Online Library

The important role high‐performance computing (HPC) resources play in science and
engineering research, coupled with its high cost (capital, power and manpower), short life …

Cited by 19 · Related articles · All 7 versions

[PDF] arxiv.org

Integrating abstractions to enhance the execution of distributed applications

M Turilli, F Liu, Z Zhang, A Merzky… - 2016 IEEE …, 2016 - ieeexplore.ieee.org

One of the factors that limits the scale, performance, and sophistication of distributed
applications is the difficulty of concurrently executing them on multiple distributed computing …

Cited by 20 · Related articles · All 9 versions

[PDF] nsf.gov

Application kernels: HPC resources performance monitoring and variance analysis

NA Simakov, JP White, RL DeLeon… - Concurrency and …, 2015 - Wiley Online Library

Application kernels are computationally lightweight benchmarks or applications run
repeatedly on high performance computing (HPC) clusters in order to track the Quality of …

Cited by 13 · Related articles · All 5 versions
