A survey of rollback-recovery protocols in message-passing systems

EN Elnozahy, L Alvisi, YM Wang… - ACM Computing Surveys …, 2002 - dl.acm.org
This survey covers rollback-recovery techniques that do not require special language
constructs. In the first part of the survey we classify rollback-recovery protocols into …

[BOOK][B] Elements of distributed computing

VK Garg - 2002 - books.google.com
A lucid and up-to-date introduction to the fundamentals of distributed computing systems. As
distributed systems become increasingly available, the need for a fundamental discussion of …

An efficient optimistic message logging scheme for recoverable mobile computing systems

T Park, N Woo, HY Yeom - IEEE Transactions on mobile …, 2002 - ieeexplore.ieee.org
A number of checkpointing and message logging algorithms have been proposed to support
fault tolerance of mobile computing systems. However, little attention has been paid to the …

Systems, methods, and computer-readable media for using immutable and copy-on-write data semantics to optimize record and replay frameworks

M Marron - US Patent 10,268,567, 2019 - Google Patents
Systems, methods, and computer-readable media are disclosed for using managed runtime
environment semantics to optimize record and replay frameworks. One method includes …

An efficient recovery scheme for mobile computing environments

T Park, N Woo, HY Yeom - Proceedings. Eighth International …, 2001 - ieeexplore.ieee.org
This paper presents an efficient recovery scheme based on checkpointing and message
logging for mobile computing systems. For the efficient management of checkpoints and …

Fine: A fully informed and efficient communication-induced checkpointing protocol for distributed systems

Y Luo, D Manivannan - Journal of Parallel and Distributed Computing, 2009 - Elsevier
Communication-Induced Checkpointing (CIC) protocols are classified into two categories in
the literature: Index-based and Model-based. In this paper, we discuss two data structures …

An asynchronous recovery scheme based on optimistic message logging for mobile computing systems

T Park, HY Yeom - Proceedings 20th IEEE International …, 2000 - ieeexplore.ieee.org
This paper presents an asynchronous recovery scheme to provide fault-tolerance for mobile
computing systems. The proposed scheme is based on optimistic message logging, since …

[PDF][PDF] Fault-tolerant distributed simulation

OP Damani, VK Garg - ACM SIGSIM Simulation Digest, 1998 - dl.acm.org
In traditional distributed simulation schemes, the entire simulation needs to be restarted if any of
the participating LPs crashes. This is highly undesirable for long-running simulations. Some …

An efficient recovery scheme for fault-tolerant mobile computing systems

T Park, N Woo, HY Yeom - Future Generation Computer Systems, 2003 - Elsevier
This paper presents an efficient recovery scheme to provide fault-tolerance for the mobile
computing systems. The proposed scheme is based on the message logging and the …

Optimistic distributed simulation based on transitive dependency tracking

OP Damani, VK Garg, YM Wang - US Patent 6,031,987, 2000 - Google Patents
An optimistic distributed simulation method applicable to event-driven simulation that
requires only a single rollback announcement per straggler message, with no need for other …