Optimistic recovery in distributed systems
ACM Transactions on Computer Systems (TOCS)
Checkpointing and Rollback-Recovery for Distributed Systems
IEEE Transactions on Software Engineering - Special issue on distributed systems
Paradigms for process interaction in distributed programs
ACM Computing Surveys (CSUR)
Manetho: Transparent Roll Back-Recovery with Low Overhead, Limited Rollback, and Fast Output Commit
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Distributed snapshots: determining global states of distributed systems
ACM Transactions on Computer Systems (TOCS)
Trade-offs in implementing causal message logging protocols
PODC '96 Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing
Optimistic Crash Recovery without Changing Application Messages
IEEE Transactions on Parallel and Distributed Systems
Fail-stop processors: an approach to designing fault-tolerant computing systems
ACM Transactions on Computer Systems (TOCS)
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
Message Logging: Pessimistic, Optimistic, Causal, and Optimal
IEEE Transactions on Software Engineering
Message Logging in Mobile Computing
FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
Publishing: a reliable broadcast communication mechanism
SOSP '83 Proceedings of the ninth ACM symposium on Operating systems principles
How to recover efficiently and asynchronously when optimism fails
ICDCS '96 Proceedings of the 16th International Conference on Distributed Computing Systems (ICDCS '96)
Completely Asynchronous Optimistic Recovery with Minimal Rollbacks
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Hi-index | 0.00 |
This paper presents three garbage collection schemes for causal message logging with independent checkpointing. The first scheme allows each process to autonomously remove useless log information in its volatile storage by piggybacking only some additional information without requiring any extra message and forced checkpoint. Additionally, it supports faster output commit than traditional schemes. The second scheme enables each process to remove a part of log information in the storage if more empty space is required. It reduces the number of processes participating in the garbage collection by using the size of the log information of each process. The third scheme is a hybrid scheme having the advantages of the two proposed schemes. Simulation results show that the third scheme significantly reduces the garbage collection overhead compared with the traditional schemes regardless of specific communication patterns of distributed applications.