Bounds on Algorithm-Based Fault Tolerance in Multiple Processor Systems
IEEE Transactions on Computers - The MIT Press scientific computation series
IEEE Transactions on Computers
A Fault-Tolerant FFT Processor
IEEE Transactions on Computers
A Fault-Tolerant Systolic Sorter
IEEE Transactions on Computers
Fault-Tolerant Matrix Triangularizations on Systolic Arrays
IEEE Transactions on Computers
A Linear Algebraic Model of Algorithm-Based Fault Tolerance
IEEE Transactions on Computers
Real-Number Codes for Fault-Tolerant Matrix Operations on Processor Arrays
IEEE Transactions on Computers
Algorithm-Based Fault Detection for Signal Processing Applications
IEEE Transactions on Computers
Algorithm-Based Fault Tolerance on a Hypercube Multiprocessor
IEEE Transactions on Computers
Probabilistic Evaluation of Online Checks in Fault-Tolerant Multiprocessor Systems
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Switching and Finite Automata Theory: Computer Science Series
Switching and Finite Automata Theory: Computer Science Series
Optimal Design of Checks for Error Detection and Location in Fault-Tolerant Multiprocessor Systems
IEEE Transactions on Computers
A theory for algorithm-based fault tolerance in array processor systems (checks, bounds, graph-theoretic, errors)
Analysis and design of algorithm-based fault-tolerant systems
Analysis and design of algorithm-based fault-tolerant systems
Algorithm-Based Fault Location and Recovery for Matrix Computations on Multiprocessor Systems
IEEE Transactions on Computers
Graceful Degradation in Algorithm-Based Fault Tolerant Multiprocessor Systems
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
Using Data Flow Information to Obtain Efficient Check Sets for Algorithm-Based Fault Tolerance
International Journal of Parallel Programming
Optimal Design of Checks for Error Detection and Location in Fault-Tolerant Multiprocessor Systems
IEEE Transactions on Computers
Synthesis of Algorithm-Based Fault-Tolerant Systems from Dependence Graphs
IEEE Transactions on Parallel and Distributed Systems
Partitioned Encoding Schemes for Algorithm-Based Fault Tolerance in Massively Parallel Systems
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
Hi-index | 14.99 |
Parallel processing architectures are commonly used for signal processing and other computationally intensive applications. These applications are characterized by high throughput and long processing periods. Such characteristics decrease the reliability of high-performance architectures. The erroneous data produced by faulty processors could have damaging consequences, particularly in critical real-time applications. It is therefore desirable that any erroneous data produced by the system be detected and located as quickly as possible. Algorithm-based fault tolerance (ABFT) is a low-cost system-level concurrent error detection and fault location scheme. Methods used in the analysis of multiprocessor systems using system-level diagnosis are applied to the analysis of ABFT systems. A new algorithm for analyzing an ABFT system for its fault diagnosability is developed using these methods. Based on this work, a fault diagnosis algorithm is developed for ABFT systems.