This paper introduces an assertion scheme based on backward error analysis for detecting errors in algorithms that solve dense systems of linear equations, $A\mathbf{x} = \mathbf{b}$. Unlike previous methods, this Backward Error Assertion Model is designed specifically for floating point arithmetic subject to round-off errors, and it can be readily instrumented in a watchdog processor environment. Verifying the assertions costs $O(n^2)$ operations, compared with the $O(n^3)$ cost of the algorithms that solve $A\mathbf{x} = \mathbf{b}$. Unlike other proposed error detection methods, this assertion model does not require any encoding of the matrix $A$. Experimental results under various error models are presented to validate the effectiveness of the scheme.
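The abstract does not state the assertion itself, so the following is only a minimal sketch of what a backward-error check of this kind might look like. It assumes the standard normwise relative backward error, $\eta = \|\mathbf{b} - A\mathbf{x}\|_\infty / (\|A\|_\infty \|\mathbf{x}\|_\infty + \|\mathbf{b}\|_\infty)$, and compares it against a threshold proportional to $n$ times machine epsilon. The function names `backward_error` and `assertion_holds`, the `slack` parameter, and the threshold form are illustrative assumptions, not the paper's definitions. The check needs one matrix-vector product and one pass over $A$, so it costs $O(n^2)$ operations and works on the unencoded matrix.

```python
import math
import sys


def backward_error(A, x, b):
    """Normwise relative backward error of x for A x = b (infinity norms).

    Assumed form: ||b - A x|| / (||A|| * ||x|| + ||b||).
    """
    n = len(A)
    # Residual r = b - A x: a single matrix-vector product, O(n^2) work.
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    norm_A = max(sum(abs(a) for a in row) for row in A)  # max absolute row sum
    norm_x = max(abs(v) for v in x)
    norm_b = max(abs(v) for v in b)
    norm_r = max(abs(v) for v in r)
    denom = norm_A * norm_x + norm_b
    return norm_r / denom if denom > 0 else 0.0


def assertion_holds(A, x, b, slack=10.0):
    """Return True if x passes the backward-error check.

    The threshold slack * n * eps is a hypothetical choice for illustration.
    """
    # Reject non-finite entries outright; they cannot be a valid solution.
    if not all(math.isfinite(v) for v in x):
        return False
    n = len(A)
    return backward_error(A, x, b) <= slack * n * sys.float_info.epsilon


if __name__ == "__main__":
    A = [[4.0, 1.0], [2.0, 3.0]]
    b = [6.0, 8.0]
    print(assertion_holds(A, [1.0, 2.0], b))  # exact solution
    print(assertion_holds(A, [1.0, 2.5], b))  # one corrupted component
```

In this sketch, a computed solution that passes the check is the exact solution of a nearby system, which is what a backward-stable solver is expected to deliver under round-off; a corrupted component typically produces a residual far above the threshold. In a watchdog setting, the check would run alongside the solver rather than inside it.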