Spare Capacity as a Means of Fault Detection and Diagnosis in Multiprocessor Systems

  • Authors:
  • A. T. Dahbura;K. K. Sabnani;W. J. Hery

  • Affiliations:
  • -;-;-

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 1989

Quantified Score

Hi-index 14.99

Visualization

Abstract

A technique for detecting and diagnosing faults at the processor level in a multiprocessor system is described. A process is assigned whenever possible to two processors: the processor to which it would normally be assigned (primarily) and an additional processor that would otherwise be idle (secondary). Two strategies are described and analyzed: one that is preemptive and another that is nonpreemptive. It is shown that, for moderately loaded systems, a sufficient percentage of processes can be performed redundantly using the system's spare capacity to provide a basis for fault detection and diagnosis with virtually no degradation of response time. A multiprocessor that uses the approach for detecting faults at the processor loads is described.