High Performance Computing (HPC) applications. There are studies that address fail-stop
errors and studies that address SDCs. However, few studies address both types of errors
together. In this paper, we propose a software-based selective replication technique that
handles both fail-stop errors and SDCs in HPC applications. Since complete replication of applications
can be costly in terms of resources, we develop a runtime-based technique for selective …