Some Practical Issues in the Design of Fault-Tolerant Multiprocessors

  • Authors:
  • Shantanu Dutt; John P. Hayes

  • Venue:
  • IEEE Transactions on Computers - Special issue on fault-tolerant computing
  • Year:
  • 1992

Abstract

This paper examines methods for modeling and implementing practical aspects of fault-tolerant multiprocessor systems that prior research has largely neglected. The node-covering design approach is generalized to accommodate systems whose structure and failure mechanisms are represented by arbitrary graphs. Several new types of covering graphs are defined, leading to various useful design tradeoffs. A new technique for incremental design is presented; it uses a class of switch implementations that reduce a system's interconnection costs. The reduction of other cost factors is also addressed, with methods for VLSI layout area minimization, fast and distributed reconfiguration, efficient transfer of state information for software recovery, and efficient use of local spares.