Implementation of an Experimental Fault-Tolerant Memory System

Authors: W. C. Carter; C. E. McCarthy
Affiliations: IBM Thomas J. Watson Research Center; -
Venue: IEEE Transactions on Computers
Year: 1976

Abstract

The experimental fault-tolerant memory system described in this paper has been designed to enable the modular addition of spares, to validate the theoretical fault-secure and self-testing properties of the translator/corrector, to provide a basis for experiments using the new testing and correction processes for recovery, and to determine the practicality of such systems. The hardware design and implementation are described, together with methods of fault insertion. The hardware/software interface, including a restricted single error correction/double error detection (SEC/DED) code, is specified. Procedures are carefully described which 1) test for specified physical faults, 2) ensure that single error corrections are not miscorrections due to triple faults, and 3) enable recovery from double errors.
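
To make the SEC/DED behavior mentioned in the abstract concrete, the following is a minimal sketch using a textbook extended Hamming (8,4) code. This is an assumption chosen for illustration: it is not the restricted SEC/DED code specified in the paper, and the names encode, decode, flip, and DATA_POSITIONS are hypothetical. The sketch shows that a single-bit error is corrected, that a double-bit error is detected but not corrected, and that a triple-bit error can look like a single error and be silently miscorrected, which is the situation procedure 2) is meant to rule out.

```python
"""Toy SEC/DED illustration with an extended Hamming (8,4) code.

Assumption: a generic textbook code, not the paper's restricted code.
"""
from itertools import combinations

DATA_POSITIONS = (3, 5, 6, 7)  # codeword positions holding the 4 data bits


def encode(nibble):
    """Return an 8-bit codeword (list of 0/1) for a 4-bit value."""
    word = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        word[pos] = (nibble >> i) & 1
    # Hamming check bits sit at the power-of-two positions 1, 2 and 4.
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    # Position 0 is an overall parity bit that makes the total parity even.
    word[0] = sum(word[1:]) % 2
    return word


def flip(word, *positions):
    """Return a copy of word with the given bit positions inverted."""
    faulty = list(word)
    for pos in positions:
        faulty[pos] ^= 1
    return faulty


def decode(word):
    """Return (status, data); data is None when the word is uncorrectable."""
    word = list(word)
    # The syndrome is the XOR of the indices of all set bits; it is 0 for
    # a valid codeword and equals the error position for a single error.
    syndrome = 0
    for pos, bit in enumerate(word):
        if bit:
            syndrome ^= pos
    odd_parity = sum(word) % 2 == 1
    if odd_parity:
        # Odd overall parity is treated as a single error at `syndrome`
        # (position 0 when the overall parity bit itself was hit).
        word[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even parity with a nonzero syndrome: two bits are wrong.
        return "double_error", None
    else:
        status = "ok"
    data = sum(word[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, data


if __name__ == "__main__":
    value = 0b1011
    codeword = encode(value)
    print(decode(flip(codeword, 5)))        # single error
    print(decode(flip(codeword, 2, 6)))     # double error
    print(decode(flip(codeword, 1, 2, 4)))  # triple error
    # Count the triple errors that are reported as "corrected" yet
    # return the wrong data.
    bad = sum(
        1
        for triple in combinations(range(8), 3)
        if decode(flip(codeword, *triple)) != ("corrected", value)
    )
    print(bad, "of 56 triple errors are miscorrected")
```

In this toy code every triple error produces odd overall parity, so the decoder reports a correction while returning wrong data. A decoder on its own cannot tell this case apart from a genuine single error, which is why additional test procedures of the kind the abstract lists are needed.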