Authors
Gokcen Kestor, Burcu Ozcelik Mutlu, Joseph Manzano, Omer Subasi, Osman Unsal, Sriram Krishnamoorthy
Publication date
2018/5/8
Book
Proceedings of the 15th ACM International Conference on Computing Frontiers
Pages
173-182
Description
Undetected soft errors caused by transient bit flips can lead to silent data corruption (SDC), an undesirable outcome where invalid results pass for valid ones. This has motivated the design of soft error detectors to minimize SDCs. However, the detectors have been studied in different contexts, making comparative evaluation difficult. In this paper, we present the first comprehensive evaluation of four online soft error detection techniques in detecting the adverse impact of soft errors on iterative methods. We observe that, across five iterative methods, the detectors studied achieve high but not perfect detection rates. To understand the potential for improved detection, we evaluate a machine-learning-based detector that takes as its features the runtime features that the individual detectors observe to arrive at their conclusions. Our evaluation demonstrates improved but still far from perfect detection accuracy for …
Total citations
[Chart: citations per year, 2018 to 2023]
Scholar articles
G Kestor, BO Mutlu, J Manzano, O Subasi, O Unsal… - Proceedings of the 15th ACM International Conference …, 2018