Fault Tolerance in Multiprocessor Systems Without Dedicated Redundancy

  • Authors: P. Agrawal
  • Venue: IEEE Transactions on Computers
  • Year: 1988

Abstract

An algorithm called RAFT (recursive algorithm for fault tolerance) for achieving fault tolerance in multiprocessor systems is described. Through the use of a combination of dynamic space- and time-redundancy techniques, RAFT achieves fault tolerance in the presence of permanent as well as intermittent faults. Performance and reliability of multiprocessor systems using RAFT are determined as a function of individual processor reliability and the total number of fault modes in a processor. RAFT-based systems are superior to triple modular redundancy (TMR) systems in hardware economy and provide comparable reliability. A multiprocessor architecture adopting RAFT is given.
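
For context on the TMR baseline mentioned above, the following minimal sketch computes the textbook reliability of a TMR system from the reliability R of a single processor. It assumes independent processor failures and a perfect voter, so the system works when at least two of three modules work: R_TMR = 3R^2 - 2R^3. The function name `tmr_reliability` is hypothetical, and the abstract does not state RAFT's own reliability expression, so none is reproduced here.

```python
def tmr_reliability(r: float) -> float:
    """Reliability of a 2-out-of-3 (TMR) system.

    Assumptions (not taken from the paper): module failures are
    independent, each module has reliability r, and the voter is perfect.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    # P(all three work) + P(exactly two work) = r^3 + 3 r^2 (1 - r)
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    for r in (0.5, 0.9, 0.99):
        print(f"R = {r:.2f}  ->  R_TMR = {tmr_reliability(r):.6f}")
```

Under these assumptions TMR improves on a single processor only when R exceeds 0.5, and it does so at the cost of three processors plus a voter per logical unit, which is the hardware overhead the abstract contrasts RAFT against.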