Static redundancy allocation is inappropriate in hard real-time systems that operate in variable and dynamic environments (e.g., radar tracking, avionics). Adaptive fault tolerance (AFT) can assure adequate reliability of critical modules under temporal and resource constraints by allocating only as much redundancy to less critical modules as can be afforded, thus gracefully reducing their resource requirements. We propose a mechanism for supporting adaptive fault tolerance in a real-time system. Adaptation is achieved by choosing, for each dynamically arriving computation, a redundancy strategy that assures the required reliability and maximizes the potential for fault tolerance while ensuring that deadlines are met. The proposed approach is evaluated using a real-life workload simulating radar tracking software in AWACS early warning aircraft. The results demonstrate that our technique outperforms static fault tolerance strategies in terms of tasks meeting their timing constraints. Further, we show that the gain in this timing-centric performance metric does not reduce the fault tolerance of the executing task below a predefined minimum level. Overall, the evaluation indicates that the proposed ideas result in a system that dynamically provides QoS guarantees along the fault-tolerance dimension.
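The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' mechanism: the strategy names, reliability figures, single-processor model, and the `choose_strategy` function are all assumptions made for illustration. For an arriving task, the sketch picks the most reliable redundancy strategy that still meets the deadline and the minimum reliability level, and rejects the task if none does.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Strategy:
    """A hypothetical redundancy strategy; names and numbers are illustrative."""
    name: str
    copies: int          # number of redundant executions of the task
    reliability: float   # assumed probability of a correct result


def choose_strategy(
    exec_time: float,
    deadline: float,
    now: float,
    committed: float,
    min_reliability: float,
    strategies: Sequence[Strategy],
) -> Optional[Strategy]:
    """Return the most reliable strategy that still meets the deadline.

    Assumption: one processor, redundant copies run back to back after
    `committed` time units of already admitted work. Returns None when no
    strategy satisfies both the deadline and the reliability floor.
    """
    feasible = [
        s for s in strategies
        if s.reliability >= min_reliability
        and now + committed + s.copies * exec_time <= deadline
    ]
    return max(feasible, key=lambda s: s.reliability, default=None)


# Hypothetical strategy table (values are placeholders, not measured data).
STRATEGIES = [
    Strategy("single", copies=1, reliability=0.90),
    Strategy("duplex", copies=2, reliability=0.99),
    Strategy("triplex", copies=3, reliability=0.999),
]

# Light load: all three copies fit before the deadline.
light = choose_strategy(2.0, 10.0, 0.0, 1.0, 0.90, STRATEGIES)
# Heavier load: only one or two copies fit, so redundancy degrades gracefully.
heavy = choose_strategy(2.0, 10.0, 0.0, 5.0, 0.90, STRATEGIES)
# Overload: not even a single copy fits, so the task is rejected.
overload = choose_strategy(2.0, 10.0, 0.0, 9.0, 0.90, STRATEGIES)
```

Under these assumptions the same task receives less redundancy as the committed load grows, but never less than the configured minimum reliability; a static scheme would instead fix the number of copies regardless of load.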