Analysis of Restart Mechanisms in Software Systems

  • Authors:
  • Aad P. A. van Moorsel;Katinka Wolter

  • Affiliations:
  • -;-

  • Venue:
  • IEEE Transactions on Software Engineering
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Restarts or retries are a common phenomenon in computing systems, for instance, in preventive maintenance, software rejuvenation, or when a failure is suspected. Typically, one sets a time-out to trigger the restart. We analyze and optimize time-out strategies for scenarios in which the expected required remaining time of a task is not always decreasing with the time invested in it. Examples of such tasks include the download of Web pages, randomized algorithms, distributed queries, and jobs subject to network or other failures. Assuming the independence of the completion time of successive tries, we derive computationally attractive expressions for the moments of the completion time, as well as for the probability that a task is able to meet a deadline. These expressions facilitate efficient algorithms to compute optimal restart strategies and are promising candidates for pragmatic online optimization of restart timers.