Fault Tolerance in Highly Parallel Hardware Systems

  • Authors: K. E. Grosspietsch

  • Venue: IEEE Micro
  • Year: 1994

Abstract

As the demand for highly parallel systems grows, the vast amount of concurrently operating hardware involved can make it difficult to guarantee proper system behavior. Problems arise both from permanent and transient hardware faults and from errors caused by improper programming. A number of fault tolerance solutions have emerged. Following a survey of fault tolerance in arrays, a discussion of solutions for more specialized architectures is presented.