An efficient computing-checkpoint based coordinated checkpoint algorithm

  • Authors:
  • Men Chaoguang;Wang Dongsheng;Zhao Yunlong

  • Affiliations:
  • Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China;Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China;Research Center of High Dependability Computing Technology, Harbin Engineering University, Harbin, Heilongjiang, P.R. China

  • Venue:
  • EUC'06 Proceedings of the 2006 international conference on Embedded and Ubiquitous Computing
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, the concept of “computing checkpoint” is introduced, and then an efficient coordinated checkpoint algorithm is proposed. The algorithm combines the two approaches of reducing the overhead associated with coordinated checkpointing, which one is to minimize the processes which take checkpoints and the other is to make the checkpointing process non-blocking. Through piggybacking the information including which processes have taken new checkpoint in the broadcast committing message, the checkpoint sequence number of every process can be kept consistent in all processes, so that the unnecessary checkpoints and orphan messages can be avoided in the future running. Evaluation result shows that the number of redundant computing checkpoints is less than 1/10 of the number of tentative checkpoints. Analyses and experiments show that the overhead of our algorithm is lower than that of other coordinated checkpoint algorithms