Algorithms for testing fault-tolerance of sequenced jobs

  • Authors:
  • Marek Chrobak;Mathilde Hurand;Jiří Sgall

  • Affiliations:
  • Department of Computer Science, University of California, Riverside, USA;Department d'Informatique (LIX), Ecole Polytechnique, Palaiseau, France;Department of Applied Mathematics, Charles University, Praha 1, Czech Republic 11800

  • Venue:
  • Journal of Scheduling
  • Year:
  • 2009

Abstract

We study the problem of testing whether a given set of sequenced jobs can tolerate transient faults. We present efficient algorithms for this problem in several fault models. A fault model describes what types of faults are allowed and specifies assumptions on their frequency. Two types of faults are considered: hidden faults, which can only be detected after a job completes, and exposed faults, which can be detected immediately.

First, we give an O(n)-time fault-tolerance testing algorithm, for both exposed and hidden faults, if the number of faults does not exceed a given parameter k.

Then we consider the model in which any two faults are separated in time by a gap of length at least Δ, where Δ is at least twice the maximum job length. For exposed faults, we give an O(n)-time algorithm. For hidden faults, we give an algorithm with running time O(n^2), and we prove that if job lengths are distributed uniformly over an interval [0, p_max], then this algorithm's expected running time is O(n). Our experimental study shows that this linear-time performance extends to other distributions. Finally, we provide evidence that improving the worst-case performance may not be possible, by proving an Ω(n^2) lower bound, in the algebraic computation tree model, on a slight generalization of this problem.
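To make the bounded-fault setting concrete, here is a minimal Python sketch of a linear-time check in a deliberately simplified model. It is not the paper's algorithm, and the model is an assumption, since the abstract does not spell one out: jobs run back-to-back in the given order starting at time 0, each job has a length and a deadline, there are no release times, and a hidden fault forces the affected job to be rerun in full. The function name and parameters are hypothetical.

```python
from typing import Sequence


def tolerates_k_hidden_faults(
    lengths: Sequence[float], deadlines: Sequence[float], k: int
) -> bool:
    """Check a job sequence against at most k hidden faults (simplified model).

    Assumed model, not taken from the paper: jobs run back-to-back from
    time 0 in the given order, and each hidden fault wastes one full run
    of the job it hits. The worst case for job j is then all k faults
    hitting the longest job among the first j jobs.
    """
    elapsed = 0.0  # fault-free completion time of the current job
    longest = 0.0  # longest job length seen so far
    for length, deadline in zip(lengths, deadlines):
        elapsed += length
        longest = max(longest, length)
        # Worst-case completion time: fault-free time plus k wasted runs.
        if elapsed + k * longest > deadline:
            return False
    return True
```

Under these assumptions a single pass with a running sum and a running maximum suffices, so the check takes O(n) time. Release times, exposed faults, and the Δ-separated fault model would each need a different analysis.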