Uncovering access, reuse, and sharing characteristics of I/O-Intensive files on Large-Scale...

B Yang, W Xue, T Zhang, S Liu, X Ma, X Wang… - ACM Transactions on …, 2023 - dl.acm.org

This paper offers a solution to overcome the complexities of production system I/O
performance monitoring. We present Beacon, an end-to-end I/O resource monitoring and …

Cited by 77 · Related articles · All 6 versions

[PDF] google.com

Systematically inferring I/O performance variability by examining repetitive job behavior

E Costa, T Patel, B Schwaller, JM Brandt… - Proceedings of the …, 2021 - dl.acm.org

Monitoring and analyzing I/O behaviors is critical to the efficient utilization of parallel storage
systems. Unfortunately, with increasing I/O requirements and resource contention, I/O …

Cited by 23 · Related articles · All 4 versions

[PDF] acm.org

Access patterns and performance behaviors of multi-layer supercomputer I/O subsystems under production load

JL Bez, AM Karimi, AK Paul, B Xie, S Byna… - Proceedings of the 31st …, 2022 - dl.acm.org

Scientific computing workloads at HPC facilities have been shifting from traditional
numerical simulations to AI/ML applications for training and inference while processing and …

Cited by 20 · Related articles · All 7 versions

[PDF] osti.gov

Understanding HPC application I/O behavior using system level statistics

AK Paul, O Faaland, A Moody… - 2020 IEEE 27th …, 2020 - ieeexplore.ieee.org

The processor performance of high performance computing (HPC) systems is increasing at
a much higher rate than storage performance. This imbalance leads to I/O performance …

Cited by 39 · Related articles · All 9 versions

[PDF] usenix.org

StRAID: Stripe-threaded Architecture for Parity-based RAIDs with Ultra-fast SSDs

S Wang, Q Cao, Z Lu, H Jiang, J Yao… - 2022 USENIX Annual …, 2022 - usenix.org

Popular software storage architecture Linux Multiple-Disk (MD) for parity-based RAID (e.g.,
RAID5 and RAID6) assigns one or more centralized worker threads to efficiently process all …

Cited by 15 · Related articles · All 6 versions

[PDF] arxiv.org

tf-Darshan: Understanding fine-grained I/O performance in machine learning workloads

SWD Chien, A Podobas, IB Peng… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org

Machine Learning applications on HPC systems have been gaining popularity in recent
years. The upcoming large scale systems will offer tremendous parallelism for training …

Cited by 21 · Related articles · All 6 versions

[PDF] usenix.org

Full lifecycle data analysis on a large-scale and leadership supercomputer: what can we learn from it?

B Yang, H Wei, W Zhu, Y Zhang, W Liu… - 2024 USENIX Annual …, 2024 - usenix.org

The system architecture of contemporary supercomputers is growing increasingly intricate
with the ongoing evolution of system-wide network and storage technologies, making it …

[PDF] uta.edu

Explorations and Exploitation for Parity-based RAIDs with Ultra-fast SSDs

S Wang, Q Cao, H Jiang, Z Lu, J Yao, Y Chen… - ACM Transactions on …, 2024 - dl.acm.org

Following a conventional design principle that pays more fast-CPU-cycles for fewer slow-
I/Os, popular software storage architecture Linux Multiple-Disk (MD) for parity-based RAID …

Cited by 2 · Related articles · All 2 versions

ScaleCache: A Scalable Page Cache for Multiple Solid-State Drives

KT Pham, S Cho, S Lee, LA Nguyen, H Yeo… - Proceedings of the …, 2024 - dl.acm.org

This paper presents a scalable page cache called ScaleCache for improving SSD
scalability. Specifically, we first propose a concurrent data structure of page cache based on …

Cited by 4 · Related articles · All 3 versions

[PDF] akougkas.io

DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics

M Tang, J Cernuda, J Ye, L Guo… - 2024 IEEE …, 2024 - ieeexplore.ieee.org

The combination of ever-growing scientific datasets and distributed workflow complexity
creates I/O performance bottlenecks due to data volume, velocity, and variety. Although the …

Cited by 1 · Related articles · All 4 versions
