Future Generation Computer Systems
This paper introduces a combination of existing parallel checkpointing techniques for software-heterogeneous ClusterGrid infrastructures. Most existing solutions aim at application transparency (no checkpoint-related code has to be developed in the application), while some others provide middleware transparency (no service modification is required). The main contribution of this paper is a solution that provides both application and middleware transparency at the same time. Compatibility and integrity requirements are identified, and the corresponding conditions are established using Abstract State Machines. The most relevant checkpointing systems are checked against these conditions to examine their conformity. Based on the conditions, a novel checkpointing method is defined, and a proof-of-concept checkpointing tool called TotalCheckpoint (TCKPT) is introduced.