Title | Authors | Venue | Cited by | Year |
Fault tolerance of MPI applications in exascale systems: The ULFM solution | N Losada, P González, MJ Martín, G Bosilca, A Bouteiller, K Teranishi | Future Generation Computer Systems 106, 467-481, 2020 | 53 | 2020 |
Resilient MPI applications using an application-level checkpointing framework and ULFM | N Losada, I Cores, MJ Martín, P González | The Journal of Supercomputing 73, 100-113, 2017 | 38 | 2017 |
Local rollback for resilient MPI applications with application-level checkpointing and message logging | N Losada, G Bosilca, A Bouteiller, P González, MJ Martín | Future Generation Computer Systems 91, 450-464, 2019 | 32 | 2019 |
Portable Application-level Checkpointing for Hybrid MPI-OpenMP Applications | N Losada, MJ Martín, G Rodríguez, P González | Procedia Computer Science 80, 19-29, 2016 | 14 | 2016 |
Assessing resilient versus stop-and-restart fault-tolerant solutions in MPI applications | N Losada, MJ Martín, P González | The Journal of Supercomputing 73, 316-329, 2017 | 12 | 2017 |
A portable and adaptable fault tolerance solution for heterogeneous applications | N Losada, BB Fraguela, P González, MJ Martín | Journal of Parallel and Distributed Computing 104, 146-158, 2017 | 6 | 2017 |
Extending an Application-Level Checkpointing Tool to Provide Fault Tolerance Support to OpenMP Applications | N Losada, MJ Martín, G Rodríguez, P González | Journal of Universal Computer Science 20 (9), 1352-1372, 2014 | 6 | 2014 |
I/O optimization in the checkpointing of OpenMP parallel applications | N Losada, MJ Martín, G Rodríguez, P González | 2015 23rd Euromicro International Conference on Parallel, Distributed, and …, 2015 | 5 | 2015 |
Evaluating Data Redistribution in PaRSEC | Q Cao, G Bosilca, N Losada, W Wu, D Zhong, J Dongarra | IEEE Transactions on Parallel and Distributed Systems 33 (8), 1856-1872, 2021 | 4 | 2021 |
Asynchronous Receiver-Driven Replay for Local Rollback of MPI Applications | N Losada, A Bouteiller, G Bosilca | 2019 IEEE/ACM 9th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS …, 2019 | 3 | 2019 |
Application-level Fault Tolerance and Resilience in HPC Applications | N Losada | Universidade da Coruña, 2018 | 1 | 2018 |
Towards Ad Hoc Recovery For Soft Errors | N Losada, L Bautista-Gomez, K Keller, O Unsal | 2018 IEEE/ACM 8th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS …, 2018 | 1 | 2018 |
Insights into Application-level Solutions towards Resilient MPI Applications | P González, N Losada, MJ Martín | 2018 International Conference on High Performance Computing & Simulation …, 2018 | 1 | 2018 |
Stop&Restart vs Resilient MPI applications | N Losada, MJ Martín, P González | 16th International Conference on Computational and Mathematical Methods in …, 2016 | 1 | 2016 |
Towards resilience in MPI applications using an application-level checkpointing framework | N Losada, I Cores, MJ Martín, P González | 15th International Conference on Computational and Mathematical Methods in …, 2015 | 1 | 2015 |
Resilience of Parallel Applications | N Losada, MJ Martín, P González | NESUS Winter School & PhD Symposium 2016, Network For Sustainable Ultrascale …, 2016 | | 2016 |