Real-time systems often have very high reliability requirements and are therefore prime candidates for the inclusion of fault tolerance techniques. To provide tolerance to software faults, some form of state restoration is usually advocated as a means of recovery. State restoration can be expensive, and the cost is exacerbated for systems that utilize concurrent processes. The concurrency present in most real-time systems, and the further difficulties introduced by timing constraints, suggest that providing tolerance for software faults may be inordinately expensive or complex. We believe that this need not be the case, and propose a straightforward, pragmatic approach to software fault tolerance which we believe to be applicable to many real-time systems. The approach takes advantage of the structure of real-time systems to simplify error recovery, and a classification scheme for errors is introduced. Responses to each type of error are proposed which allow service to be maintained.