Quantifying fault recovery in multiprocessor systems

  • Authors:
  • Frank Harary; Miroslaw Malek

  • Affiliations:
  • New Mexico State University, Las Cruces, NM 88003, U.S.A.; Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712-1084, U.S.A.

  • Venue:
  • Mathematical and Computer Modelling: An International Journal
  • Year:
  • 1993

Abstract

We formalize and quantify various aspects of reliable computing, with emphasis on efficient fault recovery. The most appropriate mathematical model proves to be the one provided by graph theory. We develop new measures for fault recovery and observe that the values of the elements of the fault recovery vector depend not only on the computation graph H and the architecture graph G, but also on the specific location of a fault. In our examples, we choose a hypercube as a representative parallel computer architecture and a pipeline as a typical configuration for program execution. We define dependability qualities of such a system with or without a fault. These qualities are determined by the resiliency triple, which consists of three parameters: multiplicity, robustness, and configurability. We also introduce parameters for measuring recovery effectiveness in terms of distance, time, and the number of new, used, and moved nodes and edges.
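
The dependence of recovery cost on fault location can be illustrated with a small sketch. The abstract does not give the paper's formal definitions, so everything below is an assumption made for illustration: the architecture graph G is the 3-dimensional hypercube, the computation graph H is a 4-stage pipeline placed along a Gray-code path, and recovery is a brute-force search for the fault-free placement that moves the fewest stages (ties broken by the fewest newly used nodes). The function names `hypercube`, `simple_paths`, and `recover` are hypothetical and are not taken from the paper.

```python
def hypercube(n):
    """Adjacency lists of the n-cube: nodes are 0..2^n-1, edges join labels differing in one bit."""
    return {v: [v ^ (1 << b) for b in range(n)] for v in range(2 ** n)}


def simple_paths(G, k, banned):
    """Yield every simple path on k nodes of G that avoids the node `banned`."""
    def extend(path):
        if len(path) == k:
            yield tuple(path)
            return
        for w in G[path[-1]]:
            if w != banned and w not in path:
                yield from extend(path + [w])

    for start in G:
        if start != banned:
            yield from extend([start])


def recover(G, old, faulty):
    """Re-place the pipeline `old` (stage i runs on node old[i]) so it avoids `faulty`.

    Illustrative cost only (an assumption, not the paper's fault recovery vector):
    first minimize the number of moved stages, then the number of newly used nodes.
    Returns (new_placement, moved_stages, new_nodes).
    """
    old_nodes = set(old)

    def cost(new):
        moved = sum(a != b for a, b in zip(old, new))
        fresh = len(set(new) - old_nodes)
        return (moved, fresh, new)

    best = min(simple_paths(G, len(old), faulty), key=cost)
    moved, fresh, _ = cost(best)
    return best, moved, fresh


G = hypercube(3)
# Consecutive Gray codes differ in one bit, so they form a path in the hypercube.
pipeline = tuple(i ^ (i >> 1) for i in range(4))  # (0, 1, 3, 2)

for fault in (0, 1):
    print(fault, recover(G, pipeline, fault))
```

Under these assumptions, a fault at node 0 (the first pipeline stage) is repaired by moving a single stage, giving the placement (5, 1, 3, 2), whereas a fault at node 1 (an interior stage) forces two stages to move, giving (0, 2, 3, 7). The same G and H therefore yield different recovery costs depending on where the fault occurs. The exhaustive search is only practical for very small graphs.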