Fault-Tolerant Computing: An Introduction and a Viewpoint
IEEE Transactions on Computers
An Approach to the Diagnosis of Intermittent Faults
IEEE Transactions on Computers
On-Line Diagnosis of Unrestricted Faults
IEEE Transactions on Computers
Fault Masking in Combinational Logic Circuits
IEEE Transactions on Computers
A Totally Self-Checking Checker Design for the Detection of Errors in Periodic Signals
IEEE Transactions on Computers
An Advanced Fault Isolation System for Digital Logic
IEEE Transactions on Computers
The Architectural Elements of a Symmetric Fault-Tolerant Multiprocessor
IEEE Transactions on Computers
A Damage- and Fault-Tolerant Input/Output Network
IEEE Transactions on Computers
Orthogonal Latin Square Configuration for LSI Memory Yield and Reliability Enhancement
IEEE Transactions on Computers
A Reliability Model for Gracefully Degrading and Standby-Sparing Systems
IEEE Transactions on Computers
Reliability Modeling of Compensating Module Failures in Majority Voted Redundancy
IEEE Transactions on Computers
The Probability of a Correct Output from a Combinational Circuit
IEEE Transactions on Computers
Some Problems in Certifying Microprograms
IEEE Transactions on Computers
Methodology for the Generation of Program Test Data
IEEE Transactions on Computers
Diversified Test Methods for Local Control Units
IEEE Transactions on Computers
Design of Reliable Synchronous Sequential Circuits
IEEE Transactions on Computers
Transient Failures in Triple Modular Redundancy Systems with Sequential Modules
IEEE Transactions on Computers
Analysis of Logic Circuits with Faults Using Input Signal Probabilities
IEEE Transactions on Computers
A Combinatorial Solution to the Reliability of Interwoven Redundant Logic Networks
IEEE Transactions on Computers
A Module-Level Testing Approach for Combinational Networks
IEEE Transactions on Computers
Fault-tolerant computing has been defined as "the ability to execute specified algorithms correctly regardless of hardware failures, total system flaws, or program fallacies" [1]. To the extent that a system falls short of meeting the requirements of this definition, it can be labeled a partially fault-tolerant system [2]. The definition thus provides a standard against which to measure all systems having a degree of fault tolerance. In particular, one can classify systems according to 1) the amount of manual intervention required in performing three basic functions, and 2) the class of faults covered by those functions. The three basic functions involved in fault tolerance are system validation, fault diagnosis, and fault masking or recovery. The word "fault" is used here to cover the "failures, flaws, and fallacies" of the original definition. The first function, system validation, is involved in the design and production of the system hardware and software, while the last two, fault diagnosis and fault masking or recovery, are embodied in the system itself. Likewise, the first function is directed at faults arising from design and production errors, whereas the last two are aimed at faults due to random hardware failures.