Adaptive optimal checkpoint interval and its impact on system's overall quality in soft real-time applications

  • Authors: Nianen Chen; Shangping Ren

  • Affiliations: Illinois Institute of Technology, Chicago, IL (both authors)

  • Venue: Proceedings of the 2009 ACM Symposium on Applied Computing
  • Year: 2009

Abstract

Soft real-time systems often must satisfy both timing and probabilistic fault-tolerance requirements. When checkpointing is used for fault tolerance, the checkpointing frequency directly affects the system's overall quality, measured as an integrated value of system QoS properties such as availability, task execution time, and task deadline miss probability. In this paper, we first formally analyze, under a Poisson probabilistic fault model, how the checkpoint interval relates to each of system availability, task execution time, and task deadline miss probability. We then define the system's overall quality as a weighted sum of these three QoS measures and formulate an optimization problem to determine the checkpoint interval that maximizes it. We also present a prototype implementation of a framework that supports adaptive checkpointing, together with a set of experiments run on the framework that further validate our analytical results.
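
To make the weighted-sum formulation concrete, the following is a minimal sketch and not the paper's actual model. The abstract does not give the formulas, so every function, parameter name, and scoring rule below is an assumption: the expected execution time uses the standard restart model for Poisson faults, availability is approximated as the useful-work fraction, and the deadline miss probability uses a crude fixed-penalty-per-fault approximation. The optimal interval is found by a plain grid search.

```python
import math


def expected_exec_time(T, W, C, R, lam):
    """Expected time to finish W units of work with checkpoint interval T.

    Assumption (not from the paper): each segment of T work plus C checkpoint
    overhead restarts from the last checkpoint after a fault, recovery costs
    R, faults are Poisson with rate lam and do not strike during recovery.
    """
    segments = W / T  # treated as a continuous quantity for simplicity
    per_segment = (1.0 / lam + R) * (math.exp(lam * (T + C)) - 1.0)
    return segments * per_segment


def miss_probability(T, W, C, R, lam, D):
    """Crude estimate of the probability of missing deadline D.

    Assumption: every fault costs a fixed penalty of R plus half a segment of
    lost work, and the fault count over the fault-free run is Poisson.
    """
    fault_free = (W / T) * (T + C)
    if fault_free > D:
        return 1.0
    penalty = R + (T + C) / 2.0
    k_max = int((D - fault_free) // penalty)  # faults the slack can absorb
    mu = lam * fault_free
    term = math.exp(-mu)  # P(K = 0)
    cdf = term
    for k in range(1, k_max + 1):
        term *= mu / k
        cdf += term
    return max(0.0, 1.0 - cdf)


def overall_quality(T, W, C, R, lam, D, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of three scores in [0, 1]; the scoring rules are assumed."""
    w_avail, w_time, w_deadline = weights
    expected = expected_exec_time(T, W, C, R, lam)
    availability = W / expected  # useful-work fraction
    time_score = max(0.0, 1.0 - expected / D)  # shorter runs score higher
    deadline_score = 1.0 - miss_probability(T, W, C, R, lam, D)
    return (w_avail * availability
            + w_time * time_score
            + w_deadline * deadline_score)


def best_interval(candidates, W, C, R, lam, D, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Grid search for the candidate interval with the highest quality."""
    return max(candidates,
               key=lambda T: overall_quality(T, W, C, R, lam, D, weights))
```

For example, `best_interval(range(1, 1001), W=1000.0, C=2.0, R=5.0, lam=0.001, D=1500.0)` searches integer intervals for a hypothetical task. Very short intervals are penalized by checkpoint overhead and very long ones by rework after faults, so the maximum falls between the extremes. Changing `weights` shifts the trade-off among the three measures.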