2-step algorithm for enhancing effectiveness of sender-based message logging

Authors:
Jinho Ahn
Affiliations:
Kyonggi University, Yiuidong, Yeongtonggu, Suwonsi Kyonggido, Korea
Venue:
SpringSim '07 Proceedings of the 2007 spring simulation multiconference - Volume 2
Year:
2007

Abstract

Sender-based message logging allows each message to be logged in the volatile storage of its sender. This avoids logging messages to stable storage and results in lower failure-free overhead than receiver-based message logging. However, in this approach each process must keep, in its limited volatile storage, the log information of the messages it has sent so that their receivers can be recovered. In this paper, we propose a 2-step algorithm that efficiently removes logged messages from volatile storage while ensuring consistent recovery of the system in case of process failures. In the first step, the algorithm eliminates useless log information from volatile storage without requiring any extra messages or forced checkpoints. Even after this step has been performed, however, more free buffer space may be needed for logging future messages. In that case, the second step forces useful log information to become useless, guided by a vector that records the size of the log information kept for every other process. As a result, the algorithm incurs fewer additional messages and forced checkpoints than existing algorithms.
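The two steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say how a sender learns that a log entry is useless, nor how the second step chooses which processes to checkpoint. The sketch assumes that a sender somehow learns the receive sequence number (RSN) covered by a receiver's latest checkpoint, and that the second step greedily picks the receivers with the largest logged sizes. All names (`LoggedMessage`, `SenderLog`, `step1_discard_useless`, `step2_pick_checkpoint_targets`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LoggedMessage:
    receiver: int  # id of the process the message was sent to
    rsn: int       # receive sequence number assigned by the receiver
    size: int      # bytes this entry occupies in the sender's volatile log


@dataclass
class SenderLog:
    entries: list = field(default_factory=list)
    # Vector from the abstract: logged bytes kept for every other process.
    log_size: dict = field(default_factory=dict)

    def used(self) -> int:
        return sum(self.log_size.values())

    def log(self, msg: LoggedMessage) -> None:
        self.entries.append(msg)
        self.log_size[msg.receiver] = self.log_size.get(msg.receiver, 0) + msg.size

    def step1_discard_useless(self, receiver: int, checkpointed_rsn: int) -> int:
        """Step 1 (assumed mechanism): drop entries already covered by the
        receiver's latest checkpoint. Returns the number of bytes freed."""
        kept, freed = [], 0
        for m in self.entries:
            if m.receiver == receiver and m.rsn <= checkpointed_rsn:
                freed += m.size
            else:
                kept.append(m)
        self.entries = kept
        self.log_size[receiver] = self.log_size.get(receiver, 0) - freed
        return freed

    def step2_pick_checkpoint_targets(self, needed: int) -> list:
        """Step 2 (assumed policy): choose receivers in descending order of
        logged size until a forced checkpoint at each would free `needed` bytes."""
        targets, reclaimable = [], 0
        by_size = sorted(self.log_size.items(), key=lambda kv: kv[1], reverse=True)
        for receiver, size in by_size:
            if reclaimable >= needed or size == 0:
                break
            targets.append(receiver)
            reclaimable += size
        return targets


log = SenderLog()
log.log(LoggedMessage(receiver=1, rsn=1, size=30))
log.log(LoggedMessage(receiver=1, rsn=2, size=30))
log.log(LoggedMessage(receiver=2, rsn=1, size=20))
log.log(LoggedMessage(receiver=3, rsn=1, size=10))

freed = log.step1_discard_useless(receiver=1, checkpointed_rsn=1)
targets = log.step2_pick_checkpoint_targets(needed=40)
```

In this sketch, choosing the largest entries of the size vector first keeps the number of processes asked to take a forced checkpoint small, which is one plausible reading of how the vector reduces additional messages and forced checkpoints.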