Checkpointing reduces the time needed to recover from a fault by saving intermediate program states in reliable storage. The length of the interval between checkpoints affects program execution time: long intervals lead to long reprocessing times after a fault, while overly frequent checkpointing incurs high checkpointing overhead. In this paper, we present an on-line algorithm for checkpoint placement. The algorithm uses the current cost of a checkpoint to decide whether or not to place one. The total execution-time overhead under the proposed algorithm is smaller than the overhead under fixed intervals. Although the proposed algorithm uses only on-line knowledge of the checkpointing cost, its behavior is close to that of the off-line optimal algorithm, which uses complete knowledge of the checkpointing cost.
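The abstract does not spell out the decision rule, so the following is only an illustrative sketch of how an on-line placement policy driven by the current checkpoint cost might look. It is not the paper's algorithm. It assumes a discrete-step execution, a constant per-step failure rate, and a first-order estimate of expected rework of `fail_rate * d**2 / 2` for an interval of `d` steps. The function names (`online_checkpoints`, `fixed_checkpoints`, `expected_overhead`) and the threshold rule are hypothetical.

```python
from typing import List, Sequence


def online_checkpoints(costs: Sequence[float], fail_rate: float) -> List[int]:
    """Return the steps after which a checkpoint is taken.

    costs[t] is the cost of checkpointing right after step t, assumed to be
    known only once step t has finished. Hypothetical rule (an assumption,
    not the paper's): checkpoint as soon as the estimated rework accumulated
    since the last checkpoint, fail_rate * d**2 / 2, reaches the current cost.
    """
    placed = []
    d = 0  # steps executed since the last checkpoint
    for t, cost in enumerate(costs):
        d += 1
        if fail_rate * d * d / 2 >= cost:
            placed.append(t)
            d = 0
    return placed


def fixed_checkpoints(n_steps: int, interval: int) -> List[int]:
    """Baseline: checkpoint after every `interval` steps."""
    return [t for t in range(n_steps) if (t + 1) % interval == 0]


def expected_overhead(costs: Sequence[float], placed: Sequence[int],
                      fail_rate: float) -> float:
    """Checkpoint cost paid plus the first-order rework estimate per interval."""
    total = 0.0
    last = -1
    for t in placed:
        d = t - last
        total += costs[t] + fail_rate * d * d / 2
        last = t
    tail = len(costs) - 1 - last  # steps after the final checkpoint
    return total + fail_rate * tail * tail / 2


if __name__ == "__main__":
    # Checkpoints are expensive for the first 10 steps, cheap afterwards.
    costs = [10.0] * 10 + [0.1] * 10
    online = online_checkpoints(costs, fail_rate=0.01)
    fixed = fixed_checkpoints(len(costs), interval=5)
    print("online:", online, expected_overhead(costs, online, 0.01))
    print("fixed: ", fixed, expected_overhead(costs, fixed, 0.01))
```

In this toy setting the on-line rule postpones checkpoints while they are expensive and places them more densely once they become cheap, whereas the fixed-interval baseline pays the high cost twice. With a constant cost `C`, the same threshold yields an interval of roughly `sqrt(2 * C / fail_rate)` steps, which matches the familiar first-order approximation for the optimum fixed interval.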