In this paper, we investigate the effect of three time-triggered system rejuvenation policies on service availability using a queuing model. The model is formulated as an extended stochastic Petri net using a variety of distributions for times between state changes. We define a metric for steady-state service availability and derive how it can be estimated from the models in a hybrid approach that combines simulation and analytical reasoning. We further analyze the time to failure of systems with rejuvenation. Experiments show that both the optimal rejuvenation interval and the achievable improvement in service availability depend significantly on system utilization. The experiments also show that service availability can deviate significantly from steady-state system availability. For low utilization, all rejuvenation policies perform well. For medium utilization, one policy is significantly inferior to the other two, while for high utilization, no rejuvenation should be performed at all.