Application and middleware transparent checkpointing with TCKPT on ClusterGrids

  • Authors:
  • József Kovács; Peter Kacsuk; Radoslaw Januszewski; Gracjan Jankowski

  • Affiliations:
  • MTA SZTAKI, Parallel and Distributed Systems Laboratory, 1111 Budapest, Kende 13-17, Hungary; MTA SZTAKI, Parallel and Distributed Systems Laboratory, 1111 Budapest, Kende 13-17, Hungary; Poznan Supercomputing and Networking Center, 61-704 Poznan, Noskowskiego 12/14, Poland; Poznan Supercomputing and Networking Center, 61-704 Poznan, Noskowskiego 12/14, Poland

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2010

Abstract

This paper introduces a combination of existing parallel checkpointing techniques for software-heterogeneous ClusterGrid infrastructures. Most existing solutions aim to support application transparency (no checkpoint-related code development in the application), while some others provide middleware transparency (no service modification). The main contribution of this paper is a solution that provides both application and middleware transparency at the same time. Compatibility and integrity requirements are identified, and the corresponding conditions are established using Abstract State Machines. The most relevant checkpointing systems are checked against these conditions to examine their conformity. Based on the conditions, a novel checkpointing method is defined, and a proof-of-concept checkpointing tool called TotalCheckpoint (TCKPT) is introduced.