Integrating Inter-Node Communication with a Resilient Asynchronous Many-Task Runtime System

SR Paul, A Hayashi, M Whitlock, S Bak… - 2020 Workshop on …, 2020 - ieeexplore.ieee.org
Achieving fault tolerance is one of the significant challenges of exascale computing due to
projected increases in soft/transient failures. While past work on software-based resilience …

Integrating Inter-Node Communication with a Resilient Asynchronous Many-Task Runtime System

SR Paul, A Hayashi, M Whitlock, S Bak… - 2020 Workshop on …, 2020 - computer.org
Achieving fault tolerance is one of the significant challenges of exascale computing due to
projected increases in soft/transient failures. While past work on software-based resilience …

[PDF] Integrating Inter-Node Communication with a Resilient Asynchronous Many-Task Runtime System

SR Paul, AK Hayashi, MJ Whitlock, S Bak, K Teranishi… - 2020 - osti.gov
Achieving fault tolerance is one of the significant challenges of exascale computing due to
projected increases in soft/transient failures. While past work on software-based resilience …
