Using Accelerated Life Tests to Estimate Time to Software Aging Failure

Authors:
Rivalino Matias Jr.; Kishor S. Trivedi; Paulo R. M. Maciel
Venue:
ISSRE '10 Proceedings of the 2010 IEEE 21st International Symposium on Software Reliability Engineering
Year:
2010

Abstract

Software aging is the progressive degradation of a software system during runtime, and it is particularly noticeable in long-running applications. Aging-related failures are very difficult to observe because aging effects usually accumulate only over long periods of execution. Collecting a statistically significant sample of times to aging-related failure, so as to estimate the system’s lifetime distribution, is therefore a very hard task. This is an important problem because it prevents many experimental and analytical studies, mainly those focused on modeling software aging, from using representative parameter values. In this paper we propose and evaluate the use of quantitative accelerated life tests (QALT) to reduce the time needed to obtain the lifetime distribution of systems that fail due to software aging. Because QALT was developed for hardware failures, we adapt it to software aging experiments. We test the proposed approach experimentally by estimating the lifetime distribution of a real web server system. The accuracy of the estimated distribution is evaluated by comparing its reliability estimates with a sample of failure times observed from the real system under test. The mean time to failure calculated from the real sample falls inside the 90% confidence interval constructed from the estimated lifetime distribution, which demonstrates the high accuracy of the estimated model. For the real system investigated, the proposed approach reduces the time required to obtain the failure times by a factor of seven.
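The general idea behind an accelerated life test can be illustrated with a minimal sketch: observe lifetimes at elevated stress levels, fit a life-stress relationship, and extrapolate to the normal-use stress level. The abstract does not state which relationship the authors use, so the inverse power law below, L(S) = 1 / (K * S^n), is an assumption chosen for illustration. The function names, the stress levels, and the lifetimes are all hypothetical and are not data from the paper.

```python
import math


def fit_inverse_power_law(stress_levels, mean_lifetimes):
    """Fit L(S) = 1 / (K * S**n) by least squares on the log-log form.

    Taking logs gives ln L = -ln K - n * ln S, a straight line in ln S,
    so an ordinary linear regression recovers K and n.
    Assumption: the inverse power law is used only for illustration.
    """
    xs = [math.log(s) for s in stress_levels]
    ys = [math.log(life) for life in mean_lifetimes]
    count = len(xs)
    x_bar = sum(xs) / count
    y_bar = sum(ys) / count
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    n = -slope
    K = math.exp(-intercept)
    return K, n


def predicted_life(K, n, stress):
    """Mean life predicted by the fitted model at a given stress level."""
    return 1.0 / (K * stress ** n)


# Hypothetical mean times to failure (hours) at three accelerated
# stress levels; stress = 1.0 stands for the normal-use workload.
stress = [2.0, 4.0, 8.0]
life = [50.0, 25.0, 12.5]

K, n = fit_inverse_power_law(stress, life)
use_life = predicted_life(K, n, 1.0)

# Acceleration factor: how much shorter the life is at the highest
# tested stress level than at the normal-use level.
acceleration = use_life / predicted_life(K, n, 8.0)
print(f"K={K:.4f}, n={n:.2f}, use-level life={use_life:.1f} h, AF={acceleration:.1f}")
```

In a full study the point estimates would be replaced by a fitted lifetime distribution at each stress level, with confidence intervals on the extrapolated use-level estimate. The sketch only shows the extrapolation step.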