Implementing fault-tolerance in real-time systems by automatic program transformations

Authors:
Tolga Ayav; Pascal Fradet; Alain Girault
Affiliations:
INRIA Rhône-Alpes, Saint-Ismier cedex, France; INRIA Rhône-Alpes, Saint-Ismier cedex, France; INRIA Rhône-Alpes, Saint-Ismier cedex, France
Venue:
EMSOFT '06: Proceedings of the 6th ACM & IEEE International Conference on Embedded Software
Year:
2006

Abstract

We present a formal approach to implementing and certifying fault-tolerance in real-time embedded systems. The initial, fault-intolerant system consists of a set of independent periodic tasks scheduled onto a set of fail-silent processors. We transform the tasks such that, assuming the availability of an additional spare processor, the system tolerates one failure at a time (transient or permanent). Failure detection is implemented using heartbeating, and failure masking using checkpointing and rollback. These techniques are described and implemented as automatic program transformations on the tasks' programs. The proposed formal approach to fault-tolerance by program transformation highlights the benefits of separation of concerns and allows us to establish correctness properties.
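
To make the two mechanisms named in the abstract concrete, here is a minimal Python sketch of checkpointing with rollback and of a heartbeat timeout check. It is an illustration only, not the authors' transformation: the names `run_with_checkpoints`, `missed_heartbeat`, `interval`, and `fail_at` are hypothetical, the task is modelled as a plain list of state-to-state functions, and timing constraints are ignored.

```python
import copy


def run_with_checkpoints(steps, state, interval, fail_at=None):
    """Run `steps` in order, saving a checkpoint every `interval` steps.

    `fail_at` is a hypothetical fault-injection hook: the step index at
    which a single transient failure is simulated. The work done since
    the last checkpoint is lost and re-executed.
    """
    checkpoint = (0, copy.deepcopy(state))  # (next step index, saved state)
    rollbacks = 0
    i = 0
    while i < len(steps):
        if i == fail_at and rollbacks == 0:
            # Simulated failure: discard current state, restore the checkpoint.
            i, state = checkpoint[0], copy.deepcopy(checkpoint[1])
            rollbacks += 1
            continue
        state = steps[i](state)
        i += 1
        if i % interval == 0:
            checkpoint = (i, copy.deepcopy(state))
    return state, rollbacks


def missed_heartbeat(last_beat, now, period):
    """Suspect a fail-silent processor once no heartbeat has arrived
    for more than one heartbeat period."""
    return now - last_beat > period


# Example task (assumed for illustration): add 1..5 to an accumulator.
steps = [lambda s, k=k: {"acc": s["acc"] + k} for k in range(1, 6)]
clean, _ = run_with_checkpoints(steps, {"acc": 0}, interval=2)
faulty, n = run_with_checkpoints(steps, {"acc": 0}, interval=2, fail_at=3)
```

In this sketch the run with one injected failure rolls back once and ends in the same state as the fault-free run, which is the masking property the abstract refers to. In a real-time setting, the cost of taking checkpoints and re-executing lost work would also have to fit within each task's deadline, which this sketch does not model.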