Partitioned Encoding Schemes for Algorithm-Based Fault Tolerance in Massively Parallel Systems

Authors:
J. Rexford;N. K. Jha
Affiliations:
-;-
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1994

Abstract

Considers the applicability of algorithm-based fault tolerance (ABFT) to massively parallel scientific computation. Existing ABFT schemes can provide effective fault tolerance at a low cost for computation on matrices of moderate size; however, the methods do not scale well to floating-point operations on large systems. This short note proposes the use of a partitioned linear encoding scheme to provide scalability. Matrix algorithms employing this scheme are presented and compared to current ABFT schemes. It is shown that the partitioned scheme provides scalable linear codes with improved numerical properties at the cost of only a small increase in hardware and time overhead.
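
The general idea of a partitioned checksum encoding can be illustrated with a small sketch for matrix multiplication. This is not the authors' scheme as published; it is a minimal illustration assuming a simple sum checksum per block of rows, and the names `encode_rows`, `check_blocks`, `block`, and `tol` are hypothetical. Because the encoding is linear, the checksum row of each block of `A` multiplied by `B` should equal the sum of that block's rows in the product, so each block can be checked independently and each checksum accumulates only `block` terms rather than the whole column.

```python
def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def encode_rows(a, block):
    """Append one checksum row (column-wise sum) after every `block` rows.

    Assumption for this sketch: len(a) is a multiple of `block`.
    """
    if len(a) % block != 0:
        raise ValueError("row count must be a multiple of the block size")
    out = []
    for start in range(0, len(a), block):
        part = a[start:start + block]
        out.extend(part)
        out.append([sum(col) for col in zip(*part)])
    return out


def check_blocks(c_enc, block, tol=1e-9):
    """Return indices of blocks whose data rows do not sum to their checksum row.

    `tol` is a relative tolerance that absorbs floating-point round-off.
    """
    bad = []
    stride = block + 1  # `block` data rows followed by one checksum row
    for idx, start in enumerate(range(0, len(c_enc), stride)):
        part = c_enc[start:start + stride]
        data, check = part[:-1], part[-1]
        sums = [sum(col) for col in zip(*data)]
        if any(abs(s - c) > tol * max(1.0, abs(c)) for s, c in zip(sums, check)):
            bad.append(idx)
    return bad


# Small demonstration with two blocks of two rows each.
a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
b = [[1.0, 0.5], [2.0, 1.0]]
c_enc = matmul(encode_rows(a, 2), b)
clean = check_blocks(c_enc, 2)

# Inject an error into a data row of the second block (rows 3 and 4 of c_enc).
c_enc[3][0] += 1.0
faulty = check_blocks(c_enc, 2)
```

In this sketch a fault is localized to the block whose check fails, and the round-off in each comparison depends on the block size rather than on the full matrix dimension, which is the kind of numerical benefit the abstract attributes to partitioning.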