Floating Point Fault Tolerance with Backward Error Assertions

Authors:
Daniel Boley;Gene H. Golub;Samy Makar;Nirmal Saxena;Edward J. McCluskey
Venue:
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Year:
1995

Abstract

This paper introduces an assertion scheme, based on backward error analysis, for detecting errors in algorithms that solve dense systems of linear equations, $A\mathbf{x} = \mathbf{b}$. Unlike previous methods, this Backward Error Assertion Model is specifically designed to operate in floating point arithmetic subject to round-off errors, and it can be easily instrumented in a watchdog processor environment. Verifying the assertions costs $O(n^2)$ operations, compared to the $O(n^3)$ cost of the algorithms that solve $A\mathbf{x} = \mathbf{b}$. Unlike other proposed error detection methods, this assertion model does not require any encoding of the matrix $A$. Experimental results under various error models are presented to validate the effectiveness of this assertion scheme.
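
To make the idea concrete, here is a minimal sketch of a residual-based check in the spirit of the abstract. It is not the paper's assertion: it assumes the standard normwise backward error $\|\mathbf{b} - A\mathbf{x}\| / (\|A\|\,\|\mathbf{x}\| + \|\mathbf{b}\|)$ in the infinity norm, and the function names, the `safety` factor, and the tolerance `safety * n * eps` are all hypothetical choices made for illustration. The check needs only one matrix-vector product, so it costs $O(n^2)$ operations, and it works on the original $A$ without any encoding.

```python
import math
import sys


def backward_error(A, x, b):
    """Normwise backward error ||b - A x|| / (||A|| ||x|| + ||b||), infinity norm."""
    n = len(b)
    # Residual r = b - A x: a single matrix-vector product, O(n^2) work.
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    norm_A = max(sum(abs(a) for a in row) for row in A)  # max absolute row sum
    norm_x = max(abs(v) for v in x)
    norm_b = max(abs(v) for v in b)
    norm_r = max(abs(v) for v in r)
    denom = norm_A * norm_x + norm_b
    # denom == 0 implies b == 0 and A x == 0, so the residual is exactly zero.
    return norm_r / denom if denom > 0.0 else 0.0


def assertion_holds(A, x, b, safety=10.0):
    """Accept x if its backward error is at round-off level.

    The tolerance safety * n * eps is an assumed, illustrative threshold,
    not a bound taken from the paper.
    """
    if not all(math.isfinite(v) for v in x):
        return False  # NaN or infinity in the solution is always rejected
    tol = safety * len(b) * sys.float_info.epsilon
    return backward_error(A, x, b) <= tol


A = [[4.0, 1.0], [2.0, 3.0]]
b = [6.0, 8.0]
good = [1.0, 2.0]     # exact solution of A x = b
bad = [1.0, 2.001]    # solution with one corrupted component
print(assertion_holds(A, good, b), assertion_holds(A, bad, b))
```

In this small example the exact solution passes and the corrupted one is rejected. A practical threshold would have to be tuned to the solver and to the error model under study.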