X Yang, Y Du, P Wang, H Fu, J Jia - IEEE Transactions on Parallel and …, 2009 - dl.acm.org
As the size of large-scale computer systems increases, their mean time between failures is
becoming significantly shorter than the execution time of many current scientific applications …