Fault-secure algorithms for multiple-processor systems

Authors:
Prithviraj Banerjee; Jacob A. Abraham
Affiliations:
Computer Systems Group, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign; Computer Systems Group, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Venue:
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Year:
1984

Abstract

In this paper we describe techniques for achieving fault secureness at low cost in multiple-processor systems. To do this, we consider the relationships between algorithms, parallel architectures, and fault tolerance. The concept of fault-secure algorithms, described in this paper, involves applying the ideas of fault tolerance at the system level to high-performance multiple-processor algorithms so that the results of the computation are reliable. Algorithms are grouped into broad classes called paradigms, which are determined exclusively by the communication patterns of the processors. Fault-secure techniques are presented for three powerful paradigms: the multiplex, the recursive combination, and the multiplex-demultiplex paradigms. The basic idea used in the design of fault-tolerant algorithms is that the algorithms operate on encoded input data and produce encoded output data, such that the overhead in time and number of processors is not high. This technique is distinguished by three characteristics: the encoding of the data used by the algorithm, the redesign of the algorithm to operate on the encoded data, and the distribution of the computation steps in the algorithm among the computation units.
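
The three characteristics named in the abstract (encoding the data, redesigning the algorithm to work on encoded data, and distributing the computation steps among units) can be illustrated with a small sketch. This is not the paper's construction: it assumes a simple sum checksum as the code, a linear per-element operation (scaling), one simulated processor per element, and at most one faulty processor. All function names are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def encode(data: List[int]) -> List[int]:
    """Encoding step (assumed code): append a checksum equal to the sum of the data."""
    return data + [sum(data)]


def run_processors(encoded: List[int], scale: int,
                   faulty: Optional[int] = None) -> List[int]:
    """Redesigned algorithm: every element, including the checksum, is scaled.

    Each list position stands for one simulated processor, so a single
    faulty processor can corrupt at most one output element.
    """
    out = []
    for i, x in enumerate(encoded):
        y = scale * x
        if i == faulty:
            y += 1  # inject an error from the (hypothetical) faulty processor
        out.append(y)
    return out


def is_code_word(encoded_out: List[int]) -> bool:
    """The output is valid only if the last element equals the sum of the rest."""
    return sum(encoded_out[:-1]) == encoded_out[-1]


def fault_secure_scale(data: List[int], scale: int,
                       faulty: Optional[int] = None) -> List[int]:
    """Return the scaled data, or raise if the encoded output fails the check."""
    out = run_processors(encode(data), scale, faulty)
    if not is_code_word(out):
        raise RuntimeError("fault detected: output is not a valid code word")
    return out[:-1]


if __name__ == "__main__":
    print(fault_secure_scale([1, 2, 3], 4))
    try:
        fault_secure_scale([1, 2, 3], 4, faulty=1)
    except RuntimeError as err:
        print(err)
```

Because scaling is linear, the checksum of the outputs equals the scaled checksum of the inputs, so in this sketch a single corrupted element always breaks the check. The overhead in this toy setting is one extra processor plus one summation for the check.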