Construction of Check Sets for Algorithm-Based Fault Tolerance

  • Authors:
  • D. Gu;D. J. Rosenkrantz;S. S. Ravi

  • Affiliations:
  • -;-;-

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 1994

Quantified Score

Hi-index 14.99

Visualization

Abstract

Algorithm-based fault tolerance (ABFT) is a popular approach to achieve fault and error detection in multiprocessor systems. The design problem for ABFT is concerned with the construction of a check set of minimum cardinality that detects a specified number of errors or faults. Previous work on this problem has assumed an a priori bound on the size of a check. We motivate and carry out an investigation of the problem without the bounded check size assumption. We establish upper and lower bounds on the number of checks needed to detect a given number of errors. The upper bounds are obtained through new schemes which are easy to implement, and the lower bounds are established using new types of arguments. These bounds are sharply different from those previously established under the bounded check size model. We also show that unlike error detection, the design problem for fault detection is NP-hard even for detecting only one fault.