Libckpt: transparent checkpointing under Unix

  • Authors: James S. Plank; Micah Beck; Gerry Kingsley; Kai Li
  • Affiliations: University of Tennessee; University of Tennessee; University of Tennessee; Princeton University
  • Venue: TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
  • Year: 1995

Abstract

Checkpointing is a simple technique for rollback recovery: the state of an executing program is periodically saved to a disk file from which it can be recovered after a failure. While recent research has developed a collection of powerful techniques for minimizing the overhead of writing checkpoint files, checkpointing remains unavailable to most application developers. In this paper we describe libckpt, a portable checkpointing tool for Unix that implements all applicable performance optimizations reported in the literature. While libckpt can be used in a mode that is almost totally transparent to the programmer, it also supports the incorporation of user directives into the creation of checkpoints. This user-directed checkpointing is an innovation unique to our work.
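
The rollback-recovery idea described in the abstract can be illustrated with a minimal application-level sketch. This is not libckpt's interface or implementation: libckpt saves process state transparently, whereas the sketch below saves an explicit Python dictionary. The names `save_checkpoint`, `load_checkpoint`, and `run`, the `interval` and `fail_at` parameters, and the toy computation are all hypothetical and chosen only for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write `state` to `path` so that a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # atomic rename over the old checkpoint
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


def run(total_steps, path, interval=100, fail_at=None):
    """Sum 0..total_steps-1, checkpointing every `interval` steps.

    `fail_at` (hypothetical) simulates a failure at a given step.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated failure")
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % interval == 0:
            save_checkpoint(state, path)
    return state["acc"]
```

Calling `run` again with the same `path` after a failure resumes from the most recent checkpoint rather than from step 0, so only the work done since that checkpoint is repeated. Writing to a temporary file and renaming it is one common way to keep the previous checkpoint intact if the failure happens during the write; it is a design choice of this sketch, not a claim about how libckpt writes its files.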