ACM Transactions on Embedded Computing Systems (TECS) - Special Section on ESTIMedia'10
We present a hardware-based approach that uses error detecting and correcting (EDAC) codes to improve the resilience of a computer system against errors occurring in main memory. Checksums are placed in the same type of memory locations and addressed in the same way as normal data. Consequently, the checksums are accessible from outside the main memory just like normal data, which enables implicit fault tolerance for the interconnection and solid-state secondary storage sub-systems. A small hardware module manages the sequential retrieval of checksums each time the integrity of data accessed by the processor sub-system needs to be verified. The proposed approach has the following properties: (a) it is cost-efficient, since it can be used with simple storage and interconnection sub-systems that do not possess any inherent EDAC mechanism; (b) it allows on-line modification of the memory protection levels; and (c) it requires no modification of the application software.
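The idea of keeping checksums in ordinary, normally addressed memory can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware module described above: the class `CheckedMemory`, the 16-byte block size, the layout with the checksum region placed directly after the data region, and the use of CRC-32 are all hypothetical choices made for illustration. CRC-32 only detects errors, so the correcting half of an EDAC code is not shown.

```python
import zlib

BLOCK_SIZE = 16      # bytes per protected block (assumed for illustration)
CHECKSUM_SIZE = 4    # a CRC-32 value occupies 4 bytes


class IntegrityError(Exception):
    """Raised when a block does not match its stored checksum."""


class CheckedMemory:
    """Flat memory whose checksums live in the same address space as the data."""

    def __init__(self, num_blocks):
        # Hypothetical layout: data blocks first, checksum region right after.
        self.checksum_base = num_blocks * BLOCK_SIZE
        self.mem = bytearray(self.checksum_base + num_blocks * CHECKSUM_SIZE)
        for block in range(num_blocks):
            self._update_checksum(block)

    def checksum_addr(self, block):
        # A checksum is addressed like any other word in memory.
        return self.checksum_base + block * CHECKSUM_SIZE

    def _block_crc(self, block):
        start = block * BLOCK_SIZE
        return zlib.crc32(self.mem[start:start + BLOCK_SIZE])

    def _update_checksum(self, block):
        addr = self.checksum_addr(block)
        crc = self._block_crc(block)
        self.mem[addr:addr + CHECKSUM_SIZE] = crc.to_bytes(CHECKSUM_SIZE, "little")

    def _blocks_touched(self, addr, length):
        if addr < 0 or length < 0 or addr + length > self.checksum_base:
            raise ValueError("access outside the data region")
        if length == 0:
            return range(0)
        return range(addr // BLOCK_SIZE, (addr + length - 1) // BLOCK_SIZE + 1)

    def write(self, addr, data):
        blocks = self._blocks_touched(addr, len(data))
        self.mem[addr:addr + len(data)] = data
        for block in blocks:
            self._update_checksum(block)

    def read(self, addr, length):
        # Fetch and compare the checksum of every block the access touches.
        for block in self._blocks_touched(addr, length):
            c_addr = self.checksum_addr(block)
            stored = int.from_bytes(self.mem[c_addr:c_addr + CHECKSUM_SIZE], "little")
            if stored != self._block_crc(block):
                raise IntegrityError(f"checksum mismatch in block {block}")
        return bytes(self.mem[addr:addr + length])


if __name__ == "__main__":
    memory = CheckedMemory(num_blocks=4)
    memory.write(0, b"hello")
    print(memory.read(0, 5))
    memory.mem[2] ^= 0x01          # simulate a single-bit upset in the data
    try:
        memory.read(0, 5)
    except IntegrityError as exc:
        print("detected:", exc)
```

Because the checksum bytes sit in the same `bytearray` as the data, anything that copies or transports the raw memory contents carries the checksums along with it, which loosely mirrors the implicit protection of interconnect and secondary storage mentioned above.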