We consider a system where transactions are processed by a single server subject to faults and recovery. A checkpoint is attempted after a fixed number of transactions have been completed, and takes some time to establish. The occurrence of a fault causes a rollback to the last checkpoint, after which all intervening transactions are reprocessed. The system is modelled by a two-dimensional Markov process with one unbounded variable (the number of transactions in the queue), and one bounded variable (the number of transactions processed since the last checkpoint). The joint steady-state distribution of the process, and hence the performance measures of interest, is found by two different methods: generating functions and spectral expansion. The problem of determining the optimal checkpointing parameter is considered.
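
The model described above can be explored numerically with a small event-driven simulation. The sketch below is not the generating-function or spectral-expansion solution from the paper; it is only a rough illustration under assumptions that the abstract does not state: Poisson arrivals (rate `lam`), exponential service (rate `mu`), Poisson faults (rate `xi`), exponentially distributed checkpoint duration (rate `beta`), instantaneous recovery, and a fault during a checkpoint aborting that checkpoint. All parameter and function names are hypothetical.

```python
import random

def simulate(lam, mu, xi, beta, N, horizon, seed=0):
    """Simulate a single-server queue with checkpointing and rollback.

    Assumed (not from the source): exponential timings, instantaneous
    recovery, and lam > 0 so that some event is always possible.
    """
    rng = random.Random(seed)
    t = 0.0
    queue = 0              # unbounded variable: transactions waiting or in service
    done = 0               # bounded variable: processed since last checkpoint (0..N)
    checkpointing = False  # True while the checkpoint is being established
    area = 0.0             # time-integral of uncommitted transactions in the system
    committed = 0
    faults = 0
    while True:
        # Collect the events that can happen in the current state.
        events = [("arrival", lam), ("fault", xi)]
        if checkpointing:
            events.append(("checkpoint_done", beta))
        elif queue > 0:
            events.append(("service", mu))
        events = [(name, rate) for name, rate in events if rate > 0]
        total = sum(rate for _, rate in events)

        dt = rng.expovariate(total)
        if t + dt >= horizon:
            area += (queue + done) * (horizon - t)
            break
        area += (queue + done) * dt
        t += dt

        # Choose one event with probability proportional to its rate.
        u = rng.random() * total
        for name, rate in events:
            if u < rate:
                break
            u -= rate

        if name == "arrival":
            queue += 1
        elif name == "service":
            queue -= 1
            done += 1
            checkpointing = (done == N)  # attempt a checkpoint after N completions
        elif name == "checkpoint_done":
            committed += done
            done = 0
            checkpointing = False
        else:  # fault: roll back, intervening transactions must be reprocessed
            queue += done
            done = 0
            checkpointing = False
            faults += 1

    return {
        "mean_in_system": area / horizon,
        "throughput": committed / horizon,
        "committed": committed,
        "faults": faults,
    }
```

Running `simulate` for several values of `N` with the other parameters fixed, and comparing `mean_in_system`, gives a rough empirical view of the optimisation over the checkpointing parameter that the abstract mentions.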