Generalized Algorithm-Based Fault Tolerance: Error Correction via Kalman Estimation

Authors:
G. Robert Redinbo
Affiliations:
Univ. of California, Davis
Venue:
IEEE Transactions on Computers
Year:
1998

Abstract

An extension to Algorithm-Based Fault Tolerance (ABFT) methodologies shows how parity values dictated by a real convolutional code can be employed by Kalman estimation techniques to perform real-number correction for protecting linear processing systems. Intermittent failures appearing in the output samples are detected and corrected using only the syndromes normally generated in ABFT schemes. The algebraic structure of a real convolutional code provides the separation needed by recursive Kalman state estimators to effect mean-square error correction. State and parity measurement equations model faults and computational noise in both the linear processing and parity generation subassemblies, and, in a departure from previous models, the noise sources are considered time-varying. The Kalman one-step estimator, which makes decisions on all parity values up to the present point, is determined. It separates naturally into detection and correction operations, permitting corrective action only when the detection levels exceed thresholds based on roundoff noise energy. The detector/corrector uses efficient multirate block processing techniques as determined by the real convolutional code. A smoothed fixed-lag Kalman estimator, which uses parity values for a fixed amount beyond the point of interest, is needed to complete the correction. It employs one-step estimator quantities, and implementation simplifications are possible. Examples showing the correction behavior and mean-square error performance are presented, and the size of the overhead calculations for detection and correction is estimated. A protected processing system is constructed by introducing additional subassemblies, mostly comparators, alongside the detection and correction parts already described. Under the usual assumption of at most a single subassembly failure, no improperly detected or corrected data leave the overall protected configuration.
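
The detect-then-correct idea in the abstract (compute syndromes from parity values, act only when they exceed a noise-based threshold) can be illustrated with a much simpler stand-in. The sketch below is not the paper's method: it uses a classic two-row weighted-checksum block code instead of a real convolutional code, and a direct single-error fix instead of Kalman estimation. All function names, the matrix `A`, and the `threshold` value are hypothetical choices made for illustration.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def parity_rows(A):
    """Precompute w1^T A and w2^T A with w1 = (1, ..., 1), w2 = (1, 2, ..., n).

    These rows let the parity values be computed from the input x alone,
    independently of the (possibly faulty) main computation y = A x.
    """
    n, cols = len(A), len(A[0])
    p1 = [sum(A[i][k] for i in range(n)) for k in range(cols)]
    p2 = [sum((i + 1) * A[i][k] for i in range(n)) for k in range(cols)]
    return [p1, p2]


def syndromes(y, parity):
    """Difference between checksums recomputed from y and the parity values."""
    s1 = sum(y) - parity[0]
    s2 = sum((i + 1) * v for i, v in enumerate(y)) - parity[1]
    return s1, s2


def detect_and_correct(y, parity, threshold):
    """Return (outputs, index); index is None when no correction was applied.

    Assumes at most one erroneous output sample. 'threshold' stands in for a
    bound derived from roundoff noise (an assumed constant here).
    """
    s1, s2 = syndromes(y, parity)
    if abs(s1) <= threshold and abs(s2) <= threshold:
        return list(y), None              # syndromes within noise: no action
    if abs(s1) <= threshold:
        raise ValueError("syndromes inconsistent with a single output error")
    j = round(s2 / s1) - 1                # a single error e gives s1 = e, s2 = (j+1) e
    if not 0 <= j < len(y):
        raise ValueError("syndromes inconsistent with a single output error")
    fixed = list(y)
    fixed[j] -= s1                        # subtract the estimated error value
    return fixed, j


# Hypothetical linear processing system y = A x.
A = [[1, 2], [3, 4], [5, 6]]
x = [1.0, -2.0]
parity = matvec(parity_rows(A), x)        # parity values from the check path

y_faulty = matvec(A, x)
y_faulty[1] += 0.5                        # inject an intermittent output error

corrected, location = detect_and_correct(y_faulty, parity, threshold=1e-9)
print(corrected, location)
```

In this toy setting the syndromes alone identify both the position and the size of the error, which loosely mirrors the abstract's claim that correction uses only the syndromes normally generated in ABFT schemes. The paper's estimator additionally models time-varying computational noise, which this sketch ignores.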