High performance and reliability are the main goals of parallel and distributed computing systems. To improve both, various checkpoint schemes have been proposed in the literature over several decades. However, the lack of general analytical models has made it difficult to compare the performance of systems that employ different checkpoint schemes. This paper develops an analytical model for evaluating the relative response time of systems that employ checkpoint schemes. The model is applied to systems employing the RFC (Roll-Forward Checkpoint), DMR-F (Double Modular Redundancy for Forward recovery), and DST (Duplex with Self-Test) schemes. The results demonstrate the feasibility of the model.