Reliability of task graph schedules with transient and fail-stop failures: complexity and algorithms

  • Authors:
  • Anne Benoit; Louis-Claude Canon; Emmanuel Jeannot; Yves Robert

  • Affiliations:
  • LIP, ENS Lyon, Lyon Cedex 07, France 69364 and Institut Universitaire de France, Paris, France; Nancy University, Nancy Cedex, France 54052 and INRIA, Le Chesnay Cedex, France; LaBRI, Talence Cedex, France 33405 and INRIA Bordeaux, Bordeaux Cedex, France; LIP, ENS Lyon, Lyon Cedex 07, France 69364 and Institut Universitaire de France, Paris, France

  • Venue:
  • Journal of Scheduling
  • Year:
  • 2012

Abstract

This paper deals with the reliability of task graph schedules with transient and fail-stop failures. While computing the reliability of a given schedule is easy in the absence of task replication, the problem becomes much more difficult when task replication is used. We fill a complexity gap in the scheduling literature: our main result is that this reliability problem is #P'-Complete (hence at least as hard as NP-Complete problems), both for transient and for fail-stop processor failures. We also study the evaluation of a restricted class of schedules, where a task cannot be scheduled before all replicas of all its predecessors have completed their execution. Although the complexity in this case with fail-stop failures remains open, we provide an algorithm to estimate the reliability while limiting evaluation costs, and we validate this approach through simulations.
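
To make the "easy" case concrete, the following is a minimal sketch rather than the paper's algorithm. It assumes a failure model the abstract does not state: failures are independent, and a task running for `duration` time units on a processor with constant failure rate `rate` succeeds with probability exp(-rate * duration). All function and parameter names are hypothetical. Without replication, the schedule succeeds only if every task succeeds, so the reliability is a plain product. The second function adds a further assumption: transient failures in the restricted class of schedules, where each replicated task is treated as succeeding when at least one of its replicas does.

```python
import math


def task_success(rate, duration):
    # Assumed model: exponentially distributed failures with constant rate.
    return math.exp(-rate * duration)


def schedule_reliability(tasks):
    """Reliability without replication.

    tasks: list of (rate, duration) pairs, one per task.
    Every task must succeed, so the probabilities multiply.
    """
    reliability = 1.0
    for rate, duration in tasks:
        reliability *= task_success(rate, duration)
    return reliability


def replicated_reliability(replica_sets):
    """Illustrative product form with replication (assumption, see lead-in).

    replica_sets: one list of (rate, duration) pairs per task,
    each pair describing one replica of that task.
    A task is counted as failed only if all of its replicas fail.
    """
    reliability = 1.0
    for replicas in replica_sets:
        all_fail = 1.0
        for rate, duration in replicas:
            all_fail *= 1.0 - task_success(rate, duration)
        reliability *= 1.0 - all_fail
    return reliability
```

In the general replicated case no such product form is available, because the outcome of a task depends on which replicas of its predecessors succeeded; this dependence is what the hardness result in the abstract concerns.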