Job completion time on a virtualized server with software rejuvenation

Authors:
Fumio Machida;Victor F. Nicola;Kishor S. Trivedi
Affiliations:
NEC Knowledge Discovery Research Laboratories, NEC Corporation;University of Twente;Duke University
Venue:
ACM Journal on Emerging Technologies in Computing Systems (JETC) - Special Issue on Reliability and Device Degradation in Emerging Technologies and Special Issue on WoSAR 2011
Year:
2014

Citing 13
Cited 0

Queueing Analysis of Fault-Tolerant Computer Systems. IEEE Transactions on Software Engineering.
Minimizing completion time of a program by checkpointing and rejuvenation. Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems.
Probability and statistics with reliability, queuing and computer science applications.
The Completion Time of Programs on Processors Subject to Failure and Repair. IEEE Transactions on Computers.
SPNP: Stochastic Petri Net Package. PNPM '89 The Proceedings of the Third International Workshop on Petri Nets and Performance Models.
Software Rejuvenation: Analysis, Module and Applications. FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing.
A Comprehensive Model for Software Rejuvenation. IEEE Transactions on Dependable and Secure Computing.
A Fast Rejuvenation Technique for Server Consolidation with Virtual Machines. DSN '07 Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
Analysis of a software system with rejuvenation, restoration and checkpointing. ISAS'08 Proceedings of the 5th international conference on Service availability.
Using Accelerated Life Tests to Estimate Time to Software Aging Failure. ISSRE '10 Proceedings of the 2010 IEEE 21st International Symposium on Software Reliability Engineering.
Candy: Component-based Availability Modeling Framework for Cloud Service Management Using SysML. SRDS '11 Proceedings of the 2011 IEEE 30th International Symposium on Reliable Distributed Systems.
Injecting Memory Leaks to Accelerate Software Failures. ISSRE '11 Proceedings of the 2011 IEEE 22nd International Symposium on Software Reliability Engineering.
Job Completion Time on a Virtualized Server Subject to Software Aging and Rejuvenation. WOSAR '11 Proceedings of the 2011 IEEE Third International Workshop on Software Aging and Rejuvenation.

Abstract

This article analyzes the completion time of a job running on a virtualized server subject to software aging and rejuvenation in a virtual machine monitor (VMM). A job running on the server may be interrupted by virtual machine (VM) failure, VMM failure, or VMM rejuvenation. Each job interruption is categorized as either preemptive-repeat (prt), in which case the interrupted job must restart from the beginning, or preemptive-resume (prs), in which case the job resumes execution from the point of interruption. Using a semi-Markov process (SMP) to model the server behavior, the steady-state server availability is computed, and the theory developed in Kulkarni et al. [1987] is used to obtain the Laplace-Stieltjes transform (LST) of the job completion time. In the numerical experiments, we introduce four types of aging behavior of the VMM. The effectiveness of VMM rejuvenation on job completion time is discussed in relation to the type of interruption it causes and the VMM aging type. With our parameter settings, VMM rejuvenation with prs job interruption improves the performance of job execution regardless of the aging type, with performance degradation taken into account.
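
The difference between prs and prt interruptions can be illustrated with a minimal Monte Carlo sketch. It is not the article's SMP model: it assumes a simplified two-state (up/down) server with exponentially distributed times to interruption and exponentially distributed downtimes, and it ignores aging, rejuvenation scheduling, and the distinction between VM and VMM failures. All names and parameters (`work`, `fail_rate`, `repair_rate`) are hypothetical and chosen only for illustration. Under these assumptions the mean completion time has a well-known closed form for each interruption type, which the simulation can be checked against.

```python
import math
import random


def simulate_completion_time(work, fail_rate, repair_rate, mode, rng):
    """Draw one job completion time on a two-state (up/down) server.

    Assumptions (not taken from the article): interruptions arrive at
    exponential rate `fail_rate` while the job runs, and each downtime is
    exponential with rate `repair_rate`.
    mode == "prs": work done before an interruption is kept.
    mode == "prt": the job restarts from the beginning after an interruption.
    """
    if mode not in ("prs", "prt"):
        raise ValueError("mode must be 'prs' or 'prt'")
    elapsed = 0.0
    remaining = work
    while True:
        time_to_interrupt = rng.expovariate(fail_rate)
        if time_to_interrupt >= remaining:
            # The job finishes before the next interruption.
            return elapsed + remaining
        # The job is interrupted; the server is then down for a random time.
        elapsed += time_to_interrupt + rng.expovariate(repair_rate)
        if mode == "prs":
            remaining -= time_to_interrupt  # resume from the interruption point
        # In "prt" mode `remaining` is left at the full work requirement.


def mean_completion_time(work, fail_rate, repair_rate, mode, samples=50000, seed=1):
    """Estimate the mean completion time by averaging independent samples."""
    rng = random.Random(seed)
    total = sum(
        simulate_completion_time(work, fail_rate, repair_rate, mode, rng)
        for _ in range(samples)
    )
    return total / samples


def expected_prs(work, fail_rate, repair_rate):
    """Closed-form mean for prs: work plus the expected total downtime."""
    return work * (1.0 + fail_rate / repair_rate)


def expected_prt(work, fail_rate, repair_rate):
    """Closed-form mean for prt with the same work requirement on every retry."""
    return (1.0 / fail_rate + 1.0 / repair_rate) * (math.exp(fail_rate * work) - 1.0)


if __name__ == "__main__":
    work, fail_rate, repair_rate = 1.0, 0.5, 2.0
    for mode, formula in (("prs", expected_prs), ("prt", expected_prt)):
        estimate = mean_completion_time(work, fail_rate, repair_rate, mode)
        exact = formula(work, fail_rate, repair_rate)
        print(f"{mode}: simulated {estimate:.3f}, closed form {exact:.3f}")
```

In this toy setting the prt mean grows exponentially with the work requirement while the prs mean grows linearly, which is one way to see why the type of interruption caused by rejuvenation matters for job completion time.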