Bounds on Algorithm-Based Fault Tolerance in Multiple Processor Systems
IEEE Transactions on Computers - The MIT Press scientific computation series
An analysis of algorithm-based fault tolerance techniques
Journal of Parallel and Distributed Computing
Determining performance measures of algorithm-based fault tolerant systems
Journal of Parallel and Distributed Computing
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
Improved Bounds for Algorithm-Based Fault Tolerance
IEEE Transactions on Computers
Optimal Design of Checks for Error Detection and Location in Fault Tolerant Multiprocessors Systems
Proceedings of the 5th International GI/ITG/GMA Conference on Fault-Tolerant Computing Systems, Tests, Diagnosis, Fault Treatment
Combinatorial Analysis of Check Set Construction for Algorithm-Based Fault Tolerance Systems
Journal of Electronic Testing: Theory and Applications
Using Data Flow Information to Obtain Efficient Check Sets for Algorithm-Based Fault Tolerance
International Journal of Parallel Programming
An Efficient Algorithm-Based Fault Tolerance Design Using the Weighted Data-Check Relationship
IEEE Transactions on Computers
Method for designing and placing check sets based on control flow analysis of programs
ISSRE '96 Proceedings of the The Seventh International Symposium on Software Reliability Engineering
Hi-index | 14.99 |
Algorithm-based fault tolerance (ABFT) is a popular approach to achieve fault and error detection in multiprocessor systems. The design problem for ABFT is concerned with the construction of a check set of minimum cardinality that detects a specified number of errors or faults. Previous work on this problem has assumed an a priori bound on the size of a check. We motivate and carry out an investigation of the problem without the bounded check size assumption. We establish upper and lower bounds on the number of checks needed to detect a given number of errors. The upper bounds are obtained through new schemes which are easy to implement, and the lower bounds are established using new types of arguments. These bounds are sharply different from those previously established under the bounded check size model. We also show that unlike error detection, the design problem for fault detection is NP-hard even for detecting only one fault.