Fault tolerant and fault testable hardware design
An algorithm called RAFT (recursive algorithm for fault tolerance) for achieving fault tolerance in multiprocessor systems is described. By combining dynamic space-redundancy and time-redundancy techniques, RAFT tolerates permanent as well as intermittent faults. The performance and reliability of multiprocessor systems using RAFT are determined as functions of individual processor reliability and the total number of fault modes in a processor. RAFT-based systems are superior to triple modular redundancy (TMR) systems in hardware economy and provide comparable reliability. A multiprocessor architecture adopting RAFT is given.
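For context on the TMR baseline that the abstract compares against, the reliability of a TMR system has a standard textbook closed form. The sketch below is only an illustration of that baseline, not of RAFT itself: it assumes three identical modules that fail independently and a perfect majority voter, neither of which is stated in the abstract, and the function name `tmr_reliability` is hypothetical.

```python
def tmr_reliability(r: float) -> float:
    """Reliability of a TMR system with a perfect majority voter.

    Assumption (not from the abstract): three identical modules fail
    independently, each with reliability r, and the system works
    whenever at least two of the three modules work.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("module reliability must lie in [0, 1]")
    # P(all three work) + P(exactly two work) = r^3 + 3 * r^2 * (1 - r)
    return r**3 + 3 * r**2 * (1 - r)


if __name__ == "__main__":
    # Compare a single processor against the TMR baseline.
    for r in (0.5, 0.9, 0.99):
        print(f"r = {r:.2f}  TMR = {tmr_reliability(r):.6f}")
```

Under these assumptions, a module reliability of 0.9 gives a TMR reliability of about 0.972, and TMR only improves on a single module when r exceeds 0.5. The corresponding expression for a RAFT-based system depends on the number of fault modes per processor and is derived in the paper, not here.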