Modeling and analysis of software aging and software failure

  • Authors: Letian Jiang; Guozhi Xu
  • Affiliation: Electronic Engineering Department, EE Building No. 1, Shanghai Jiaotong University, 800 Dongchuan Road, Shanghai 200240, People's Republic of China
  • Venue: Journal of Systems and Software
  • Year: 2007

Abstract

Many studies have reported that systems suffer more outages from software faults than from hardware faults. Recently, the phenomenon of "software aging", which is caused by aging-related faults, has been observed in many software systems. Software aging is characterized by progressive performance degradation and is mainly caused by the exhaustion of operating system resources through problems such as memory leaks, unreleased file locks, and data corruption. This paper focuses on the modeling and analysis of software aging and software failure. A stochastic time series decomposition algorithm based on robust locally weighted regression (Loess) is presented to separate the exhaustion of a system resource from the resource usage, from which the aging trend is estimated. A model of the software aging and software failure process is then constructed. Experiments on a practical server system verify the effectiveness of the proposed algorithm, and the two-stage failure process is confirmed for the first time in research on software aging. These conclusions benefit the application of software rejuvenation because they make it easier to determine when to perform rejuvenation, which is a key issue in its implementation. Results for the server system under different rejuvenation policies show that software performance can be effectively improved.
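
The abstract does not give the details of the decomposition algorithm, so the following is only a minimal sketch of the general technique it names: robust locally weighted regression in the style of Cleveland's lowess, used to pull a slowly varying trend out of a noisy resource series. The function names (`robust_loess`, `decompose`), the `span` and `iterations` parameters, and the synthetic "free memory" data are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 inside the window, 0 outside."""
    a = abs(u)
    return (1.0 - a ** 3) ** 3 if a < 1.0 else 0.0


def bisquare(u):
    """Bisquare robustness weight: (1 - u^2)^2 for |u| < 1, else 0."""
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def weighted_line_at(xs, ys, ws, x0):
    """Fit a weighted least-squares line and evaluate it at x0."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    if sxx == 0.0:  # all weight on a single x value: fall back to the mean
        return my
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    return my + (sxy / sxx) * (x0 - mx)


def robust_loess(xs, ys, span=0.5, iterations=3):
    """Smooth ys over xs with local linear fits and bisquare reweighting.

    span is the fraction of points in each local window (an assumed default);
    iterations is the number of robustness passes after the initial fit.
    """
    n = len(xs)
    k = min(n, max(3, int(math.ceil(span * n))))
    robust = [1.0] * n
    fitted = [0.0] * n
    for step in range(iterations + 1):
        for i in range(n):
            # Window half-width: distance to the k-th nearest neighbour.
            h = sorted(abs(x - xs[i]) for x in xs)[k - 1] or 1.0
            kern = [tricube((x - xs[i]) / h) for x in xs]
            ws = [kw * rw for kw, rw in zip(kern, robust)]
            if sum(ws) == 0.0:  # every neighbour was flagged as an outlier
                ws = kern
            fitted[i] = weighted_line_at(xs, ys, ws, xs[i])
        if step == iterations:
            break
        residuals = [y - f for y, f in zip(ys, fitted)]
        s = sorted(abs(r) for r in residuals)[n // 2]  # (upper) median
        if s == 0.0:
            break
        # Points with large residuals get little or no weight next pass.
        robust = [bisquare(r / (6.0 * s)) for r in residuals]
    return fitted


def decompose(xs, ys, span=0.5, iterations=3):
    """Split a series into (trend, remainder) using robust_loess."""
    trend = robust_loess(xs, ys, span, iterations)
    remainder = [y - t for y, t in zip(ys, trend)]
    return trend, remainder


if __name__ == "__main__":
    # Synthetic data (hypothetical): free memory that leaks slowly, with
    # periodic load, measurement noise, and occasional spikes.
    random.seed(1)
    xs = [float(t) for t in range(120)]
    ys = [1000.0 - 2.0 * t + 30.0 * math.sin(2 * math.pi * t / 24)
          + random.gauss(0.0, 5.0) + (200.0 if t % 17 == 0 else 0.0)
          for t in xs]
    trend, remainder = decompose(xs, ys)
    print("trend at start:", round(trend[0], 1))
    print("trend at end:  ", round(trend[-1], 1))
```

In this sketch the trend plays the role of the resource-exhaustion component and the remainder holds the short-term usage fluctuations. The bisquare reweighting is what keeps isolated spikes from dragging the trend, which is the reason for choosing a robust variant over a plain local regression. How the paper turns the estimated trend into a failure model or a rejuvenation schedule is not stated in the abstract and is not attempted here.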