Design, use and evaluation of p-fsefi: A parallel soft error fault injection framework for emulating soft errors in parallel applications

Q Guan, N DeBardeleben, P Wu, S Eidenbenz, S Blanchard, L Monroe, E Baseman, L Tan

Proceedings of the 9th EAI International Conference on Simulation Tools and …, 2016 • dl.acm.org

Future exascale application programmers and users need to quantify an application's resilience and vulnerability to soft errors before running their codes on production supercomputers, because of the cost of failures and the hazards of silent data corruption. Absent a deep understanding of a particular application's resiliency, vulnerability is commonly evaluated with fault injection tools at either the software or the hardware level. Hardware fault injection, while the most realistic, is limited to customized vendor chips, and applications usually cannot be evaluated at scale. Software fault injection is more practical and efficient, and many researchers use it as a reasonable approximation. With a sufficiently sophisticated software fault injection framework, an application can be studied to see how it would handle many of the errors that manifest at the application level. Using such a tool, a developer can progressively improve resilience at targeted locations they believe are important for their target hardware.

Cited by: 25
