Understanding the reliability characteristics of supercomputers has been a key focus of the HPC and dependability communities. However, there is no current study that analyzes both …
K Fujita, N Sakamoto, T Fujiwara… - Journal of Advanced …, 2022 - jstage.jst.go.jp
The size and complexity of supercomputer systems and their power and cooling facilities have continuously increased, thus posing additional challenges for long-term and stable …
Graphics Processing Units (GPUs) are becoming a de facto solution for accelerating a wide range of applications but remain susceptible to transient hardware faults …
J Nonaka, K Fujita, T Fujiwara… - VisGap-The Gap …, 2023 - da.lib.kobe-u.ac.jp
Flagship-class high-performance computing (HPC) systems, also known as supercomputers, are large, complex systems that require particular attention for continuous …