Fault-secure algorithms for multiple-processor systems

  • Authors:
  • Prithviraj Banerjee; Jacob A. Abraham

  • Affiliations:
  • Computer Systems Group, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign;Computer Systems Group, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign

  • Venue:
  • ISCA '84: Proceedings of the 11th Annual International Symposium on Computer Architecture
  • Year:
  • 1984

Quantified Score

Hi-index 0.02

Abstract

In this paper we describe techniques for achieving fault secureness at low cost in multiple-processor systems. To do this, we consider the relationships between algorithms, parallel architectures, and fault tolerance. The concept of fault-secure algorithms, described in this paper, involves applying the ideas of system-level fault tolerance to high-performance multiple-processor algorithms to make the results of the computation reliable. Algorithms are classified into broad classes called paradigms, which are determined exclusively by the communication patterns of the processors. Fault-secure techniques are presented for three powerful paradigms: the multiplex, the recursive combination, and the multiplex-demultiplex paradigms. The basic idea used in the design of fault-tolerant algorithms is that the algorithms operate on encoded input data and produce encoded output data such that the overhead in time and number of processors is not high. This technique is distinguished by three characteristics: the encoding of the data used by the algorithm, the redesign of the algorithm to operate on the encoded data, and the distribution of the computation steps in the algorithm among the computation units.
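
The three characteristics named in the abstract (encoding the data, redesigning the algorithm to work on encoded data, and distributing the steps among computation units) can be illustrated with a small sketch. This is not the paper's construction: the sum checksum, the scaling operation, the one-element-per-unit assignment, and all function names (`encode`, `scale_on_units`, `is_valid`, `fault_secure_scale`) are assumptions chosen only to make the idea concrete. The `faulty_unit` parameter is a hypothetical hook for injecting a single erroneous result.

```python
from typing import List, Optional


def encode(data: List[int]) -> List[int]:
    """Encode the input by appending a sum checksum (assumed encoding)."""
    return data + [sum(data)]


def scale_on_units(
    encoded: List[int], factor: int, faulty_unit: Optional[int] = None
) -> List[int]:
    """Scale every element, one element per simulated computation unit.

    Because scaling is linear, the checksum element is processed exactly
    like a data element, so the output is again checksum-encoded.
    """
    output = []
    for unit, value in enumerate(encoded):
        result = factor * value
        if unit == faulty_unit:
            result += 1  # inject one erroneous result (hypothetical fault)
        output.append(result)
    return output


def is_valid(encoded_output: List[int]) -> bool:
    """Check that the last element equals the sum of the others."""
    *values, checksum = encoded_output
    return sum(values) == checksum


def fault_secure_scale(
    data: List[int], factor: int, faulty_unit: Optional[int] = None
) -> List[int]:
    """Return the scaled data, or raise if the encoded output is invalid."""
    output = scale_on_units(encode(data), factor, faulty_unit)
    if not is_valid(output):
        raise RuntimeError("fault detected: output is not a valid codeword")
    return output[:-1]  # strip the checksum before returning


if __name__ == "__main__":
    print(fault_secure_scale([1, 2, 3], 2))
    try:
        fault_secure_scale([1, 2, 3], 2, faulty_unit=1)
    except RuntimeError as err:
        print(err)
```

In this sketch a wrong result from any single unit, including the unit that handles the checksum, leaves the output inconsistent, so the caller receives either a correct answer or an explicit error. The cost is one extra unit and one final check, which is the kind of low overhead the abstract refers to.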