Controller/Precompiler for Portable Checkpointing

  • Authors:
  • Gabriel Rodríguez; María J. Martín; Patricia González; Juan Touriño

  • Affiliations:
  • The authors are with the Computer Architecture Group, Department of Electronics and Systems, University of A Coruña, Spain. E-mail: grodriguez@udc.es

  • Venue:
  • IEICE - Transactions on Information and Systems
  • Year:
  • 2006

Abstract

This paper presents CPPC (Controller/Precompiler for Portable Checkpointing), a checkpointing tool designed for heterogeneous clusters and Grid infrastructures. Portability is achieved through the use of portable protocols, portable checkpoint files and portable code. CPPC is user-directed and works at the variable level, thus generating small checkpoint files. It allows parallel processes to checkpoint independently, without runtime coordination or message logging. Consistency is achieved at restart time by negotiating the restart point. A directive-based checkpointing precompiler has also been implemented to reduce the user's effort. CPPC was designed to work with parallel MPI programs, though it can also be used with sequential ones, and its highly modular design makes it easy to extend to parallel programs written with other message-passing libraries. Experimental results are shown using CPPC with different test applications.
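
The idea of negotiating the restart point can be pictured with a small sketch. This is not CPPC code, and the abstract does not specify the mechanism. The sketch assumes that every process numbers its checkpoints identically, that a checkpoint with a given number is taken at the same program location in all processes, and that a consistent restart point is the newest checkpoint every process can still restore. The function name `negotiate_restart_point` and the dictionary layout are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, Optional


def negotiate_restart_point(available: Dict[int, Iterable[int]]) -> Optional[int]:
    """Return the newest checkpoint id that every process can restore.

    `available` maps a process rank to the checkpoint ids found in that
    process's local checkpoint files (hypothetical layout, for illustration).
    Returns None when there are no processes or no common checkpoint.
    """
    if not available:
        return None
    # Each process took checkpoints independently, so the sets may differ.
    per_process = [set(ids) for ids in available.values()]
    # Only checkpoints present in every process are candidates.
    common = set.intersection(*per_process)
    # Ids are assumed to increase over time, so the largest is the newest.
    return max(common) if common else None


if __name__ == "__main__":
    # Rank 1 failed before writing checkpoint 3, so it is not a candidate.
    local_checkpoints = {0: [1, 2, 3], 1: [1, 2], 2: [1, 2, 3]}
    print(negotiate_restart_point(local_checkpoints))
```

In an actual message-passing program the same agreement would be reached by exchanging the locally available ids among the processes rather than by collecting them in one dictionary. The centralized form is used here only to keep the sketch self-contained.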