Optimistic recovery in distributed systems
ACM Transactions on Computer Systems (TOCS)
ACM Transactions on Computer Systems (TOCS)
Efficient distributed recovery using message logging
Proceedings of the eighth annual ACM Symposium on Principles of distributed computing
Recovery in distributed systems using optimistic message logging and check-pointing
Journal of Algorithms
Manetho: Transparent Roll Back-Recovery with Low Overhead, Limited Rollback, and Fast Output Commit
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Progressive Retry for Software Failure Recovery in Message-Passing Applications
IEEE Transactions on Computers
A message system supporting fault tolerance
SOSP '83 Proceedings of the ninth ACM symposium on Operating systems principles
Publishing: a reliable broadcast communication mechanism
SOSP '83 Proceedings of the ninth ACM symposium on Operating systems principles
Checkpointing and Its Applications
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Message logging: pessimistic, optimistic, and causal
ICDCS '95 Proceedings of the 15th International Conference on Distributed Computing Systems
Libckpt: transparent checkpointing under Unix
TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
Consistent Global Checkpoints that Contain a Given Set of Local Checkpoints
IEEE Transactions on Computers
Efficient transparent application recovery in client-server information systems
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Support for Software Interrupts in Log-Based Rollback-Recovery
IEEE Transactions on Computers
Fast cluster failover using virtual memory-mapped communication
ICS '99 Proceedings of the 13th international conference on Supercomputing
A survey of rollback-recovery protocols in message-passing systems
ACM Computing Surveys (CSUR)
Supporting nondeterministic execution in fault-tolerant systems
FTCS '96 Proceedings of the The Twenty-Sixth Annual International Symposium on Fault-Tolerant Computing (FTCS '96)
Distributed recovery with K-optimistic logging
Journal of Parallel and Distributed Computing
NT-SwiFT: software implemented fault tolerance on Windows NT
Journal of Systems and Software
Recovery guarantees for Internet applications
ACM Transactions on Internet Technology (TOIT)
Checkpointing for Peta-Scale Systems: A Look into the Future of Practical Rollback-Recovery
IEEE Transactions on Dependable and Secure Computing
Flashback: a lightweight extension for rollback and deterministic replay for software debugging
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
NT-SwiFT: software implemented fault tolerance on windows NT
WINSYM'98 Proceedings of the 2nd conference on USENIX Windows NT Symposium - Volume 2
Unstoppable stateful PHP web services
WISE'06 Proceedings of the 7th international conference on Web Information Systems
Hi-index | 0.01 |
Abstract: Much of the literature on message logging and checkpointing in the past decade has been based on a so-called optimistic approach that places more emphasis on failure-free overhead than recovery efficiency. Our experience has shown that most telecommunications systems use a pessimistic approach because the main purpose of using message logging and checkpointing is to achieve fast and localized recovery, and the failure-free overhead of a pessimistic approach can often be made reasonably low by exploiting application-specific information.