Comparative analysis of soft-error detection strategies: A case study with iterative methods

G Kestor, BO Mutlu, J Manzano, O Subasi… - Proceedings of the 15th …, 2018 - dl.acm.org
Undetected soft errors caused by transient bit flips can lead to silent data corruption (SDC),
an undesirable outcome where invalid results pass for valid ones. This has motivated the …

Neural network based silent error detector

C Wang, N Dryden, F Cappello… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
As we move toward exascale platforms, silent data corruptions (SDC) are likely to occur
more frequently. Such errors can lead to incorrect results. Attempts have been made to use …

Soft error detection for iterative applications using offline training

J Liu, G Agrawal - 2016 IEEE 23rd International Conference on …, 2016 - ieeexplore.ieee.org
Silent data corruption (SDC) from soft errors is one of the challenges for Exascale systems
as the number of cores is increasing and the feature size is decreasing. In recent years, a …

Toward effective detection of silent data corruptions for HPC applications

S Di, E Berrocal, L Bautista-Gomez… - Proceedings of the …, 2014 - sc14.supercomputing.org
Because of the large number of components, future extreme-scale systems are expected to
suffer a lot of silent data corruptions. Changes caused by silent errors flipping low-order bit …

Ground-truth prediction to accelerate soft-error impact analysis for iterative methods

BO Mutlu, G Kestor, A Cristal, O Unsal… - 2019 IEEE 26th …, 2019 - ieeexplore.ieee.org
Understanding the impact of soft errors on applications can be expensive. Often, it requires
an extensive error injection campaign involving numerous runs of the full application in the …

Injecting errors for fun and profit

S Chessin - Communications of the ACM, 2010 - dl.acm.org
“That which isn’t tested is broken.” (Author unknown) “Well, everything …

Characterization of the impact of soft errors on iterative methods

BO Mutlu, G Kestor, J Manzano, O Unsal… - 2018 IEEE 25th …, 2018 - ieeexplore.ieee.org
Soft errors caused by transient bit flips have the potential to significantly impact an
application's behavior. This has motivated the design of an array of techniques to detect …

B-SEFI: a binary level soft error fault injection tool

Y Wang, J Dong, S Zhang, D Zuo - 2019 IEEE 19th …, 2019 - ieeexplore.ieee.org
Soft errors are becoming more prominent in modern computing systems due to the
increasing integration of chips. These faults pose a major challenge for memories and logic …

Detecting silent data corruptions in the wild

HD Dixit, L Boyle, G Vunnam, S Pendharkar… - arXiv preprint arXiv …, 2022 - arxiv.org
Silent Errors within hardware devices occur when an internal defect manifests in a part of the
circuit which does not have check logic to detect the incorrect circuit operation. The results of …

One bit is (not) enough: An empirical study of the impact of single and multiple bit-flip errors

B Sangchoolie, K Pattabiraman… - 2017 47th Annual IEEE …, 2017 - ieeexplore.ieee.org
Recent studies have shown that technology and voltage scaling are expected to increase
the likelihood that particle-induced soft errors manifest as multiple-bit errors. This raises …