Dynamic Fault Reconfiguration in a Mesh-Connected MIMD Environment

  • Authors:
  • M. Ü. Uyar; A. P. Reeves

  • Affiliations:
  • AT&T Bell Labs, Holmdel, NJ; Univ. of Illinois, Urbana-Champaign

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 1988

Quantified Score

Hi-index 14.98

Abstract

The near-neighbor problem is characterized by many iterations of a parallel matrix operation in which each matrix element is recomputed as a function of itself and its immediately adjacent neighbors. Several highly parallel computer systems have been designed with the near-neighbor class of problems as the target application. As the number of processors in parallel computer systems grows, the ability to tolerate processor failures becomes more important. The authors show how fault tolerance can be achieved efficiently on an MIMD (multiple-instruction, multiple-data-stream) computer system for the near-neighbor problem by task redistribution. The techniques discussed minimize the extra data transfers and/or the extra computation in a system with faulty processors and links.
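
To make the near-neighbor pattern concrete, the following is a minimal sequential sketch in Python. It is not taken from the paper: the function names `near_neighbor_step` and `iterate` are hypothetical, and the update rule (the mean of a cell and its in-bounds north, south, west, and east neighbors) is an assumption chosen only for illustration. The abstract does not specify the update function, the boundary handling, or how the matrix is partitioned across processors.

```python
def near_neighbor_step(grid):
    """Return a new grid where each cell is the mean of itself and its
    in-bounds north, south, west, and east neighbors.

    The averaging rule is an illustrative assumption; the near-neighbor
    class only requires that each element depend on itself and its
    adjacent elements.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = grid[r][c], 1
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                # Cells on the border simply have fewer neighbors.
                if 0 <= nr < rows and 0 <= nc < cols:
                    total += grid[nr][nc]
                    count += 1
            new_grid[r][c] = total / count
    return new_grid


def iterate(grid, steps):
    """Apply the near-neighbor update `steps` times."""
    for _ in range(steps):
        grid = near_neighbor_step(grid)
    return grid
```

In a mesh-connected machine, each processor would typically hold a block of the matrix and exchange only the boundary values with adjacent processors on every iteration. This is why the placement of tasks after a failure affects the amount of extra data transfer.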