This paper describes how to design a software-based fault-tolerant application on a microprocessor (MP) so that it tolerates burst errors in memory. The approach may be called a single-version scheme (SVS). The SVS relies on a single version of the application program, enhanced with self-checking code redundancy, to tolerate memory burst errors that are difficult to correct while the application is running. Other software-based approaches conventionally detect only a few bit errors in memory and provide fail-stop fault tolerance against transient bit errors. Reed-Solomon codes are effective against burst errors, but they are used mainly offline, as in the coding of audio compact discs. The proposed online technique needs neither multiple versions of the software nor multiple machines: it runs only two copies of the enhanced application on a single machine, and uses them both to detect errors online and to tolerate them. It is an effective, low-cost online tool for hardening a microprocessor-based industrial computing system, or on-chip DRAM applications, against burst errors in processor memory at an affordable cost in code and time redundancy. The SVS aims to provide non-fail-stop fault tolerance against burst errors. The approach also supplements the Error Correcting Codes (ECC) of the memory system against both transient and permanent bit errors in memory.
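The abstract does not spell out the self-checking mechanism, so the following is only a minimal sketch of the general idea of two copies plus a self-check, not the paper's algorithm. It assumes the protected item is a block of data rather than the application code itself, and it assumes a CRC-32 checksum as the self-check. The names `GuardedBlock` and `inject_burst` are hypothetical and introduced purely for illustration.

```python
import zlib


class GuardedBlock:
    """Two copies of a data block, each with its own CRC-32 checksum.

    Illustrative assumption: the paper's actual self-checking code
    redundancy is not described in the abstract.
    """

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.crcs = [zlib.crc32(data), zlib.crc32(data)]

    def _is_valid(self, i: int) -> bool:
        # A copy is trusted only if its stored checksum still matches.
        return zlib.crc32(bytes(self.copies[i])) == self.crcs[i]

    def read(self) -> bytes:
        """Return the data from the first valid copy and repair the other.

        Execution continues after a detected error (non-fail-stop) as long
        as at least one copy is intact.
        """
        for i in (0, 1):
            if self._is_valid(i):
                other = 1 - i
                if not self._is_valid(other):
                    # Overwrite the corrupted copy with the good one.
                    self.copies[other] = bytearray(self.copies[i])
                    self.crcs[other] = self.crcs[i]
                return bytes(self.copies[i])
        raise RuntimeError("both copies are corrupted")


def inject_burst(block: GuardedBlock, copy_index: int, start: int, length: int) -> None:
    """Simulate a memory burst error by inverting `length` consecutive bytes."""
    target = block.copies[copy_index]
    for k in range(start, min(start + length, len(target))):
        target[k] ^= 0xFF


if __name__ == "__main__":
    blk = GuardedBlock(b"sensor reading: 1234")
    inject_burst(blk, 0, 3, 4)   # corrupt 4 consecutive bytes of copy 0
    print(blk.read())            # data is served from copy 1, copy 0 is repaired
```

In this sketch a burst confined to one copy is masked and repaired on the next read, which corresponds to the non-fail-stop behaviour the abstract aims for. Only when both copies fail their checks does the sketch give up, and a real design would need a policy for that case.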