Performability analysis of clustered systems with rejuvenation under varying workload

Authors:
Dazhi Wang; Wei Xie; Kishor S. Trivedi
Affiliations:
Department of Computer Science, Duke University, Durham, NC 27708, United States; Bank of America, 9 West 57th Street, New York, NY 10019, United States; Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, United States
Venue:
Performance Evaluation
Year:
2007

Abstract

This paper develops time-based rejuvenation policies to improve the performability of a cluster system. Three policies, namely standard rejuvenation, delayed rejuvenation and mixed rejuvenation, are designed to improve the cluster's performability under varying workload, and analytic models are built to evaluate them. Because the models use deterministic transitions, and analytic models based on homogeneous continuous-time Markov chains (CTMCs) do not allow non-exponential distributions, we use deterministic and stochastic Petri nets (DSPNs), whose underlying stochastic process is a Markov regenerative process (MRGP), to capture both exponential and deterministic distributions. System performability measures under the three policies are derived from the DSPN models. We show that the mixed rejuvenation policy achieves the highest performability of the three, yielding a 12% improvement in system throughput in the example presented in this paper. Delayed rejuvenation is better than standard rejuvenation with respect to the optimal job blocking probability and system throughput. For longer rejuvenation-triggering intervals, standard rejuvenation yields better results than delayed rejuvenation, while for shorter rejuvenation-triggering intervals delayed rejuvenation outperforms standard rejuvenation.
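
To give a feel for why a deterministic rejuvenation timer matters, the following is a minimal sketch of a much simpler setting than the one analyzed in the paper: a single node, no workload variation, and no DSPN. It assumes a hypothetical two-stage aging model (robust, then failure-prone, then failed) with exponential stage times, a fixed timer that triggers rejuvenation, and a timer reset after every repair or rejuvenation. Under those assumptions each restart is a renewal point, so steady-state availability follows from a renewal-reward argument. The function name, the parameter names, and the numeric rates are illustrative assumptions and are not taken from the paper.

```python
import math


def availability(delta, aging_rate, failure_rate, mean_repair, mean_rejuvenation):
    """Steady-state availability of one node under a fixed rejuvenation interval.

    Hypothetical single-node model (an assumption, not the paper's DSPN):
    time to failure T is Exp(aging_rate) + Exp(failure_rate), rejuvenation is
    triggered `delta` time units after the node (re)starts, and the timer is
    reset after every repair or rejuvenation.
    """
    a, b = aging_rate, failure_rate
    if a == b:
        raise ValueError("closed form below assumes aging_rate != failure_rate")
    ea, eb = math.exp(-a * delta), math.exp(-b * delta)
    # P(T > delta): the node survives until the deterministic timer fires.
    p_survive = (b * ea - a * eb) / (b - a)
    # E[min(T, delta)]: integral of the survival function over [0, delta].
    mean_up = (b * (1 - ea) / a - a * (1 - eb) / b) / (b - a)
    # Downtime per cycle: repair after a failure, otherwise rejuvenation.
    mean_down = (1 - p_survive) * mean_repair + p_survive * mean_rejuvenation
    return mean_up / (mean_up + mean_down)


# Illustrative rates in hours (assumed values, not from the paper).
RATES = dict(aging_rate=1 / 240, failure_rate=1 / 720,
             mean_repair=4.0, mean_rejuvenation=0.2)

for delta in (10, 100, 1000, 1e9):
    print(delta, availability(delta, **RATES))
```

Sweeping `delta` in this toy model shows the trade-off that time-based policies have to balance: a very short interval spends too much time rejuvenating, while a very long one behaves like a system with no rejuvenation at all. The cluster models in the paper add workload dependence and multiple nodes, which this sketch does not attempt to capture.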