T Martsinkevich, O Subasi, O Unsal… - IEEE International …, 2015 - experts.illinois.edu
We present a fault-tolerant protocol for task-parallel message-passing applications to
mitigate transient errors. The protocol requires the restart only of the task that experienced …