Fault-Detection by Result-Checking for the Eigenproblem

Authors:
Paula Prata;João Gabriel Silva
Venue:
EDCC-3 Proceedings of the Third European Dependable Computing Conference on Dependable Computing
Year:
1999


Abstract

This paper proposes a new fault-detection mechanism for the computation of eigenvalues and eigenvectors, the so-called eigenproblem; to the best of our knowledge, no such scheme existed before. It consists of a number of assertions that are executed on the results of the computation to determine their correctness. The scheme follows the Result-Checking principle, since it does not depend on the particular numerical algorithm used, and it handles both real and complex matrices, symmetric or not. Practical issues such as rounding errors and eigenvalue ordering are addressed, and an implementation was built on top of unmodified routines of the well-known LAPACK library. The scheme is efficient, with less than 2% performance overhead for medium to large matrices. It is effective: in extensive fault-injection experiments it exhibited a fault coverage greater than 99.7% at a confidence level of 99%. It is also easy to adapt to mathematical libraries other than LAPACK.
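The abstract does not list the assertions themselves, so the following is only a minimal sketch of what result checking for the eigenproblem might look like, not the authors' scheme. It assumes three checks chosen for illustration: unit-norm eigenvectors, a small residual A V - V diag(w), and agreement between the trace of A and the sum of the eigenvalues. The function name `check_eigen_result`, the parameter `tol_factor`, and the rounding-error tolerance are hypothetical. NumPy's `numpy.linalg.eig`, which calls LAPACK internally, stands in for the unmodified library routine.

```python
import numpy as np


def check_eigen_result(A, w, V, tol_factor=1e3):
    """Hypothetical result checker for an eigendecomposition A V = V diag(w).

    Returns True when all assertions hold within a rounding-error tolerance.
    The tolerance formula is an assumption made for this sketch.
    """
    A = np.asarray(A)
    w = np.asarray(w)
    V = np.asarray(V)
    n = A.shape[0]
    if A.shape != (n, n) or w.shape != (n,) or V.shape != (n, n):
        return False
    # A fault may produce NaN or infinity; reject those outright.
    if not (np.all(np.isfinite(w)) and np.all(np.isfinite(V))):
        return False

    eps = np.finfo(float).eps
    scale = max(np.linalg.norm(A, 1), np.finfo(float).tiny)
    rel_tol = tol_factor * n * eps   # relative rounding allowance
    abs_tol = rel_tol * scale        # same allowance scaled by ||A||_1

    # Assertion 1: each eigenvector (column of V) has Euclidean norm 1,
    # which is the normalization numpy.linalg.eig documents.
    col_norms = np.linalg.norm(V, axis=0)
    if np.any(np.abs(col_norms - 1.0) > rel_tol):
        return False

    # Assertion 2: the residual A V - V diag(w) is at rounding level.
    # V * w scales column j of V by w[j].
    residual = A @ V - V * w
    if np.linalg.norm(residual, 1) > abs_tol:
        return False

    # Assertion 3: the eigenvalues sum to the trace of A.
    # Independent of eigenvalue ordering; works for complex w as well.
    if abs(np.trace(A) - np.sum(w)) > abs_tol:
        return False

    return True


# Usage: check the output of an unmodified library routine.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
w, V = np.linalg.eig(A)
result_ok = check_eigen_result(A, w, V)
```

Because the checker looks only at the inputs and outputs, it is independent of the algorithm that produced `w` and `V`, which is the property the abstract attributes to Result Checking. This sketch computes the full residual with a matrix product and makes no attempt to reproduce the overhead or coverage figures reported above.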