Fault-tolerant systems with concurrent error-locating capability

Authors:
JianHui Jiang; YingHua Min; ChengLian Peng
Affiliations:
Department of Computer Science and Technology, Tongji University, Shanghai 200092, P.R. China and Department of Computing and Information Technology, Fudan University, Shanghai 200433, P.R. China; Institute of Computing Technology, The Chinese Academy of Sciences, Beijing 100080, P.R. China; Department of Computing and Information Technology, Fudan University, Shanghai 200433, P.R. China
Venue:
Journal of Computer Science and Technology
Year:
2003

Abstract

Fault-tolerant systems are widely used in military, industrial, and commercial applications. Most are built using multiple-modular redundancy or error-control coding. Both approaches rely on fault-tolerance-specific components (such as voters, switches, encoders, or decoders) to implement error detection or error correction. However, the problem of detecting, locating, or correcting errors in these fault-tolerance-specific components themselves has not yet been solved satisfactorily, which can greatly reduce the dependability of the whole fault-tolerant system. This paper presents a theory of robust fault-masking digital circuits that characterizes fault-tolerant systems capable of concurrent error location, together with a new scheme for dual-modular redundant systems with the partially robust fault-masking property. A basic robust fault-masking circuit consists of a basic functional circuit and an error-locating corrector. Such a circuit provides both concurrent error correction and concurrent error location. Under this circuit model, in a partially robust fault-masking dual-modular redundant system, two redundant modules based on alternating-complementary logic constitute the basic functional circuit, and an error-correction-specific circuit called the alternating-complementary corrector serves as the error-locating corrector. The performance of the scheme, including hardware complexity and time delay, is analyzed.
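
As background for the terms used in the abstract, the following minimal Python sketch illustrates what fault masking with concurrent error location means in a conventional triple-modular redundant arrangement. It is not the paper's dual-modular alternating-complementary scheme, whose details the abstract does not give. The function name `vote_and_locate` and the modelling of module outputs as integers are assumptions made purely for illustration.

```python
from typing import List, Tuple


def vote_and_locate(outputs: List[int]) -> Tuple[int, List[int]]:
    """Bitwise majority vote over three module outputs.

    Returns the voted (masked) word and the indices of the modules whose
    output disagrees with it. Hypothetical helper for illustration only.
    """
    if len(outputs) != 3:
        raise ValueError("this sketch assumes exactly three redundant modules")
    a, b, c = outputs
    # Each bit of the result is the majority of the three corresponding bits.
    voted = (a & b) | (a & c) | (b & c)
    # Error location: report every module that differs from the voted word.
    disagreeing = [i for i, out in enumerate(outputs) if out != voted]
    return voted, disagreeing


if __name__ == "__main__":
    # Module 1 delivers a word with one flipped bit; the vote masks it
    # and the location step points at module 1.
    print(vote_and_locate([0b1010, 0b1110, 0b1010]))
```

Note that the sketch treats the voter itself as fault-free. That assumption is precisely the gap the abstract highlights: errors in the fault-tolerance-specific component are neither detected nor located here.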