Soft errors caused by transient bit flips have the potential to significantly impact an application's behavior. This has motivated the design of an array of techniques to detect …
Data reduction techniques are widely demanded and used by large-scale high-performance computing (HPC) applications because of the vast volumes of data to be produced …
Understanding the impact of soft errors on applications can be expensive. Often, it requires an extensive error injection campaign involving numerous runs of the full application in the …
The resilience behavior of three prototype GMRES implementations (with incomplete LU, flexible, and randomized-SVD-based preconditioners) has been analyzed with a soft …
We propose a new way to detect and correct silent errors in the conjugate gradient algorithm. The detection criterion is simple, cheap to implement, and can be used at each …
J Chang, S Oh, D Park - 2022 International Conference on …, 2022 - ieeexplore.ieee.org
Detecting transient faults in safety-critical neural network (NN) applications operated on embedded systems has become a concern, but it is challenging to achieve high accuracy …
We present FPDetect, a low-overhead approach for detecting logical errors and soft errors affecting stencil computations without generating false positives. We develop an offline …
Soft errors caused by transient bit flips have the potential to significantly impact an application's behavior. This has motivated the design of an array of techniques to detect …
Extremely large-scale scientific simulation applications have been very important in many scientific domains, including cosmology, climate, fluid dynamics, and chemistry. It has …