Reliability is a major requirement for most safety-related systems. To meet this requirement, fault-tolerance techniques such as hardware replication and software re-execution are often used. In this paper, we address the analysis and optimization of fault-tolerant task scheduling for multiprocessor embedded systems. We adopt a set of existing fault and process models and propose a Binary Tree Analysis (BTA) to compute system-level reliability in the presence of software and hardware redundancy. The BTA is integrated into a multi-objective evolutionary algorithm via a two-step encoding to perform reliability-aware design optimization. The optimization results comprise the mapping of tasks to processing elements, the exact task and message schedule, and the fault-tolerance policy assignment. Based on the observation that permanent faults must be considered together with transient faults to achieve an optimal system design, we propose a virtual mapping technique that takes both types of faults into account. To the best of our knowledge, this is the first approach to fault-tolerant task scheduling that considers permanent and transient faults in a unified manner. The effectiveness of our approach is illustrated using several case studies.
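
To give a rough sense of how re-execution and replication affect reliability, the sketch below applies the standard independent-failure formulas. It is not the Binary Tree Analysis proposed in the paper, whose details are not given here. The function names, the assumption of independent faults with perfect fault detection, and the numeric probabilities are all hypothetical and chosen only for illustration.

```python
from math import prod


def reexecution_reliability(p_fail: float, k: int) -> float:
    """Reliability of a task that may be re-executed up to k times.

    Assumes each attempt fails independently with probability p_fail
    and that every failure is detected (both are illustrative assumptions).
    The task fails only if all k + 1 attempts fail.
    """
    return 1.0 - p_fail ** (k + 1)


def replication_reliability(replica_reliabilities) -> float:
    """Reliability of a task replicated on several processing elements.

    Assumes independent replicas and that one correct replica suffices
    (illustrative assumption; voting overhead is ignored).
    """
    return 1.0 - prod(1.0 - r for r in replica_reliabilities)


def series_reliability(task_reliabilities) -> float:
    """Reliability of an application in which every task must succeed."""
    return prod(task_reliabilities)


# Hypothetical numbers: one task protected by a single re-execution,
# another protected by two replicas.
r_task_a = reexecution_reliability(p_fail=0.1, k=1)
r_task_b = replication_reliability([0.9, 0.9])
r_system = series_reliability([r_task_a, r_task_b])
print(r_task_a, r_task_b, r_system)
```

Under these assumptions the two policies give the same per-task reliability. In a real design they differ in cost: re-execution consumes time on one processing element, while replication consumes additional processing elements. Fault-tolerance policy assignment trades these costs against each other.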