On coordinated checkpointing in distributed systems

G Cao, M Singhal - IEEE Transactions on Parallel and …, 1998 - ieeexplore.ieee.org
Coordinated checkpointing simplifies failure recovery and eliminates domino effects in case
of failures by preserving a consistent global checkpoint on stable storage. However, the …

Identification of Critical Factors in Checkpointing Based Multiple Fault Tolerance for Distributed System

S Bansal, S Sharma - Journal of Emerging Trends in Computing and …, 2010 - Citeseer
Performance of a checkpointing based multiple fault tolerance is low. The main reason is
overheads associated with checkpointing. A checkpointing algorithm can be improved by …

A comparison between different checkpoint schemes with advantages and disadvantages

M Kumar, A Choudhary, V Kumar - Int J Comput Appl Nat Semin …, 2014 - academia.edu
It is known that check pointing and rollback recovery are widely used techniques that allow a
distributed computing to progress in spite of a failure. There are two fundamental …

Using computing checkpoints implement consistent low-cost non-blocking coordinated checkpointing

C Men, X Yang - International Conference on Parallel and Distributed …, 2004 - Springer
Two approaches are used to reduce the overhead associated with coordinated
checkpointing: one is to reduce the number of synchronization messages and the number of …

A fully informed model-based checkpointing protocol for preventing useless checkpoints

J Wu, D Manivannan - International Journal of Parallel, Emergent …, 2013 - Taylor & Francis
Checkpointing and rollback recovery are widely used techniques for handling failures in
distributed systems. When processes involved in a distributed computation are allowed to …

Checkpointing in distributed computing systems

L Kumar, M Mishra, RC Joshi - Concurrency in Dependable Computing, 2002 - Springer
In this chapter, we present a message optimal non-intrusive checkpointing protocol for
nondeterministic message passing distributed computing systems that does not require …

An experimental evaluation of coordinated checkpointing in a parallel machine

LM Silva, JG Silva - European Dependable Computing Conference, 1999 - Springer
Coordinated checkpointing represents a very effective solution to assure the continuity of
distributed and parallel applications in the occurrence of failures. In previous studies it has …

A Study of Mutable Checkpointing Approach to Reduce Overheads Associated with Coordinated Checkpointing

A Chaturvedi, SS Hussain… - The SIJ Transactions …, 2013 - pdfs.semanticscholar.org
As because of the new issues in mobile computing such as: lack of stable storage, low
bandwidth of wireless channels, high mobility and limited battery life. So, coordinated check …

On consistent checkpointing in distributed systems

G Cao, M Singhal - 1997 - Citeseer
Consistent checkpointing simplifies failure recovery and eliminates the domino effect in case
of failure by preserving a consistent global checkpoint on the stable storage. However, the …

Checkpointing distributed computing systems: An optimisation approach

H Mansouri, ASK Pathan - International Journal of High …, 2019 - inderscienceonline.com
The intent of this paper is to propose an optimisation approach for a new coordinator
blocking type checkpointing algorithm to ensure reliability and fault tolerance in distributed …