A well-known technique for tolerating the failure of a single hardware component is to triplicate the component, known as triple modular redundancy (TMR). In this paper, a component is taken to be a processor-memory configuration in which the memory is organized in a bit-sliced way. If voting is performed bitwise in an orthodox TMR configuration consisting of three such components, the failure of a complete component, or the failure of bit-slices that are not at corresponding positions in the memories, can be tolerated. We present a TMR technique that uses no more redundancy than orthodox TMR yet can tolerate the failure of arbitrary bit-slices (including those at corresponding positions), up to a certain number. In addition, it can tolerate the failure of arbitrary bit-slices, up to a certain number, whenever one component is known to be malfunctioning or is disabled. This generalized TMR technique is described for processor-memory configurations processing 4-, 8-, and 16-bit words, respectively.
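To make the limitation of orthodox TMR concrete, the following minimal sketch illustrates bitwise 2-out-of-3 voting over three copies of a word. It covers only the orthodox scheme, not the generalized technique of the paper. The function names, the 8-bit example word, and the stuck-at-0 fault model are assumptions chosen for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three copies of a word."""
    return (a & b) | (a & c) | (b & c)


def stuck_at_zero(word: int, bit: int) -> int:
    """Model a failed bit-slice as stuck at 0 (an assumed fault model)."""
    return word & ~(1 << bit)


WORD = 0b10110110  # hypothetical 8-bit word stored in all three components

# Case 1: one complete component fails (here it reads as all zeros).
whole_component_failure = majority_vote(WORD, WORD, 0x00)

# Case 2: bit-slices fail at different positions in two components.
different_positions = majority_vote(
    stuck_at_zero(WORD, 1), stuck_at_zero(WORD, 2), WORD
)

# Case 3: bit-slices fail at the same position in two components.
same_position = majority_vote(
    stuck_at_zero(WORD, 1), stuck_at_zero(WORD, 1), WORD
)

print(whole_component_failure == WORD)  # voter masks the failure
print(different_positions == WORD)      # voter masks the failure
print(same_position == WORD)            # voter outputs a wrong bit
```

In the third case two of the three copies agree on a wrong value at bit 1, so the voter outputs that wrong value. This is the failure pattern that orthodox bitwise voting cannot mask and that the generalized technique is stated to tolerate.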