Controlling safety-critical real-time applications that cannot immediately be transferred to a safe state requires highly reliable Programmable Electronic Systems (PESs). This demand for fault tolerance is usually satisfied by applying redundant processing structures inside each PES and, additionally, by configuring multiple PESs redundantly. Beyond minimising the failure probability of single PESs, it is also desirable to give a redundant configuration of PESs the capability to restart single units at runtime. This requires copying a PES's internal state at runtime, since a restarted unit must equalise its internal state with that of its redundant counterparts before it can rejoin the redundant processing. As a result, redundancy attrition due to transient faults is prevented, since failed channels can be brought back online. This article states the problems associated with runtime state restoration of real-time systems, discusses the advantages and disadvantages of existing techniques, and introduces a hardware-supported state restoration concept.