SFT: a consistent checkpointing algorithm with shorter freezing time

  • Authors: Xiaohui Wei; Jiubin Ju

  • Affiliations: Jilin Univ., Changchun, China; Jilin Univ., Changchun, China

  • Venue: ACM SIGOPS Operating Systems Review
  • Year: 1998

Abstract

This paper presents SFT, a consistent checkpointing algorithm with shorter freezing time. SFT can be used to implement fault tolerance in distributed systems. Its features include shorter freezing time, lower overhead, and simple rollback. To reduce checkpointing time, a special control message (Munblock) is used to ensure that a process can respond to a checkpoint event quickly at any given time. Moreover, a main-memory algorithm is used to improve the concurrency of checkpointing. With SFT, the freezing time caused by checkpointing is less than 0.03 s, and the number of control messages is only O(n).
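
The abstract does not give the details of SFT, so the following is only a minimal sketch of the general idea behind main-memory checkpointing, not the authors' algorithm: the process is frozen only while its state is copied in memory, and the slow write to stable storage then runs concurrently with normal execution. The class `MemoryCheckpointer`, its methods, and the file layout are hypothetical names chosen for illustration; the Munblock message and the coordination between processes are not modeled.

```python
import copy
import os
import pickle
import threading


class MemoryCheckpointer:
    """Illustrative single-process sketch (hypothetical, not the SFT protocol).

    The caller is blocked only for the in-memory copy; the write to disk
    happens in a background thread.
    """

    def __init__(self, path):
        self.path = path
        self._lock = threading.Lock()  # serializes concurrent checkpoint calls
        self._writer = None

    def checkpoint(self, state):
        # Freeze window: the caller must not mutate `state` during this copy.
        with self._lock:
            snapshot = copy.deepcopy(state)
            # Let any earlier write finish so the two do not share the temp file.
            self.wait()
            # The slow write to stable storage overlaps with normal execution.
            self._writer = threading.Thread(target=self._flush, args=(snapshot,))
            self._writer.start()
        return snapshot

    def _flush(self, snapshot):
        tmp = self.path + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump(snapshot, f)
        # Atomic rename, so a crash never leaves a half-written checkpoint.
        os.replace(tmp, self.path)

    def wait(self):
        # Block until the most recent background write (if any) has finished.
        if self._writer is not None:
            self._writer.join()

    def restore(self):
        # Rollback: load the last checkpoint that reached stable storage.
        with open(self.path, "rb") as f:
            return pickle.load(f)
```

In this sketch the process may keep modifying its state as soon as `checkpoint` returns; `restore` still yields the state as it was at the moment of the copy, which is what keeps the freeze window short.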