Sender-based message logging records each message in the volatile storage of its sender. This avoids synchronous logging to stable storage and results in lower failure-free overhead than receiver-based message logging. However, under sender-based logging, each process must keep in its limited volatile storage the log information of the messages it has sent, so that their receivers can be recovered. In this paper, we propose a two-step algorithm that efficiently removes logged messages from volatile storage while ensuring consistent recovery of the system after process failures. In the first step, the algorithm eliminates useless log information from volatile storage without requiring any extra messages or forced checkpoints. Even after this step, more free buffer space may be needed for logging future messages. In that case, the second step forces useful log information to become useless, using a vector that records the size of the log information kept for every other process. This step incurs fewer additional messages and forced checkpoints than existing algorithms. Experimental results show that our algorithm performs significantly better than the traditional one with respect to garbage collection overhead.
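The two steps can be illustrated with a minimal Python sketch. It is not the paper's algorithm, whose details the abstract does not give. The class `SenderLog`, its method names, the per-receiver sequence numbers, and the "largest log first" choice of which receivers to checkpoint are all assumptions made for illustration. The sketch also assumes that a log entry becomes useless once its receiver has taken a checkpoint after receiving the message, and that the sender learns this from information it already receives, at no extra message cost.

```python
from collections import defaultdict


class SenderLog:
    """Volatile log of sent messages, kept per receiver (hypothetical sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity          # bytes of volatile storage for the log
        self.log = defaultdict(list)      # receiver -> [(seq, payload), ...]
        self.size = defaultdict(int)      # receiver -> bytes logged (the size vector)
        self.next_seq = defaultdict(int)  # receiver -> next send sequence number

    def used(self):
        """Total bytes currently held in the log."""
        return sum(self.size.values())

    def record_send(self, receiver, payload):
        """Log a message in the sender's volatile storage and return its sequence number."""
        seq = self.next_seq[receiver]
        self.next_seq[receiver] += 1
        self.log[receiver].append((seq, payload))
        self.size[receiver] += len(payload)
        return seq

    def step1_discard(self, receiver, checkpointed_upto):
        """Step 1 (assumed form): drop entries already covered by the receiver's checkpoint."""
        kept = [(s, p) for s, p in self.log[receiver] if s > checkpointed_upto]
        self.log[receiver] = kept
        self.size[receiver] = sum(len(p) for _, p in kept)

    def step2_pick_victims(self, needed):
        """Step 2 (assumed form): choose receivers to force-checkpoint, largest log first,
        until `needed` bytes would be free. Each victim costs one request and one checkpoint."""
        victims = []
        free = self.capacity - self.used()
        for receiver in sorted(self.size, key=self.size.get, reverse=True):
            if free >= needed:
                break
            if self.size[receiver] == 0:
                continue
            victims.append(receiver)
            free += self.size[receiver]
        return victims
```

In this sketch, once a chosen receiver confirms its forced checkpoint, the sender calls `step1_discard` for that receiver to reclaim the space. Consulting the size vector lets the sender ask only as many processes as it needs, which is one plausible reading of how the second step keeps additional messages and forced checkpoints low.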