The complexity of safety-related embedded computer systems is steadily increasing. Besides verifying that such systems implement the correct functionality, it is essential to …
SKS Hari, SV Adve, H Naeimi… - ACM SIGARCH …, 2012 - dl.acm.org
Future microprocessors need low-cost solutions for reliable operation in the presence of failure-prone devices. A promising approach is to detect hardware faults by deploying low …
A Löfwenmark, S Nadjm-Tehrani - Journal of Systems Architecture, 2018 - Elsevier
With more functionality added to future safety-critical avionics systems, new platforms are required to offer the computational capacity needed. Multi-core processors offer a potential …
CF Chandler, C Leangsuksun… - Proceedings of the 2009 …, 2009 - dl.acm.org
One predominant barrier encountered in furthering research and development efforts aimed at facilitating resilient HPC applications is a substantial lack of existing reliability and …
PM Wells, K Chakraborty, GS Sohi - ACM SIGOPS Operating Systems …, 2008 - dl.acm.org
Future multicore processors will be more susceptible to a variety of hardware failures. In particular, intermittent faults, caused in part by manufacturing, thermal, and voltage …
This chapter presents a case study on how to characterize the resiliency of large-scale computers. The analysis focuses on the failures and errors of Blue Waters, the Cray hybrid …
High performance computing (HPC) systems frequently suffer errors and failures from hardware components that negatively impact the performance of jobs run on these systems …
X Fu, T Li, J Fortes - Workshop on modeling, benchmarking and …, 2006 - ecoms.ee.uh.edu
Semiconductor transient faults (soft errors) are becoming an increasingly critical threat to reliable software execution. With the advent of the billion transistor chip era, it is impractical …