Journal of Systems and Software
In this paper we consider an operational software system with two failure modes and develop a stochastic model to quantify its steady-state availability. Three kinds of preventive/corrective maintenance policies (rejuvenation, restoration, and checkpointing) are incorporated in our unified availability model. We propose a dynamic programming algorithm that determines the joint optimal maintenance schedule maximizing the steady-state system availability, calculating the optimal aperiodic checkpoint sequence and the preventive rejuvenation time simultaneously. In numerical examples, we examine the sensitivity to the model parameters that characterize the failure modes and study the effects of the preventive/corrective maintenance policies in detail.