This paper considers sequential checkpoint placement problems under two dependability measures: steady-state system availability and expected reward per unit time in the steady state. We develop numerical algorithms, based on Brender's classical fixed-point algorithm, to determine the optimal checkpoint sequence, and further give three simple approximation methods. Numerical examples with the Weibull failure time distribution illustrate quantitatively the extent to which the sub-optimal checkpoint sequences obtained from the approximation methods overestimate or underestimate the optimal one.