Prosper: Program Stack Persistence in Hybrid Memory Systems

KP Arun, D Mishra, B Panda - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
A persistent and crash-consistent execution state is essential for systems to guarantee
resilience against power failures and abrupt system crashes. The availability of nonvolatile …

Zero-overhead NVM crash resilience

F Nawab, D Chakrabarti, T Kelly… - Non-Volatile Memories …, 2015 - eecs.umich.edu
Byte-addressable non-volatile memory (NVM) allows fine-grained in-place update of durable
data. NVM transaction mechanisms prevent failures during updates from corrupting data, but …

Checkpointing as a service in heterogeneous cloud environments

J Cao, M Simonin, G Cooperman… - 2015 15th IEEE/ACM …, 2015 - ieeexplore.ieee.org
A non-invasive, cloud-agnostic approach is demonstrated for extending existing cloud
platforms to include checkpoint-restart capability. Most cloud platforms currently rely on each …

A write-efficient and consistent hashing scheme for non-volatile memory

X Zhang, D Feng, Y Hua, J Chen, M Fu - Proceedings of the 47th …, 2018 - dl.acm.org
The development of non-volatile memory technologies (NVMs) has attracted interest in
designing data structures that are efficiently adapted to NVMs. In this context, several NVM …

Reliable and efficient parallel checkpointing framework for nonvolatile processor with concurrent peripherals

T Wu, K Ma, J Hu, J Xue, J Li, X Shi… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Intermittent systems powered by ambient energy harvesting are becoming popular for the
benefits of an infinite lifetime and minimum maintenance requirements. Nonvolatile …

Proposal of MPI operation level checkpoint/rollback and one implementation

Y Tang, GE Fagg, JJ Dongarra - Sixth IEEE International …, 2006 - ieeexplore.ieee.org
With the increasing number of processors in modern HPC (High Performance Computing)
systems, there are two emergent problems to solve. One is scalability, the other is fault …

Evaluating tradeoffs in granularity and overheads in supporting nonvolatile execution semantics

K Ma, MJ Liao, X Li, Z Huan… - 2017 18th International …, 2017 - ieeexplore.ieee.org
While pausing and resuming execution using nonvolatile storage has long been possible,
nonvolatile processing as a fundamental paradigm has only recently been made practical by …

Mojim: A reliable and highly-available non-volatile memory system

Y Zhang, J Yang, A Memaripour… - Proceedings of the …, 2015 - dl.acm.org
Next-generation non-volatile memories (NVMs) promise DRAM-like performance,
persistence, and high density. They can attach directly to processors to form non-volatile …

CRC-based memory reliability for task-parallel HPC applications

O Subasi, O Unsal, J Labarta, G Yalcin… - 2016 IEEE …, 2016 - ieeexplore.ieee.org
Memory reliability will be one of the major concerns for future HPC and Exascale systems.
This concern is mostly attributed to the expected massive increase in memory capacity and …

A method of self-adaptive pre-copy container checkpoint

X Chen, JH Jiang, Q Jiang - 2015 IEEE 21st Pacific Rim …, 2015 - ieeexplore.ieee.org
Container checkpoint is a kind of backward recovery fault tolerance technology, through
which the high availability of container can be achieved. Checkpoint downtime is the key …