A Fault-Tolerant Scheduling Algorithm for Real-Time Periodic Tasks with Possible Software Faults

Authors:
Ching-Chih Han; Kang G. Shin; Jian Wu
Venue:
IEEE Transactions on Computers
Year:
2003

Abstract

A hard real-time system is usually subject to stringent reliability and timing constraints, since failure to produce correct results in a timely manner may lead to a disaster. One way to avoid missing deadlines is to trade the quality of computation results for timeliness, while software fault tolerance is often achieved through redundant programs. We propose a deadline mechanism that combines these two methods to provide software fault tolerance in hard real-time periodic task systems. Specifically, we consider the problem of scheduling a set of real-time periodic tasks, each of which has two versions: primary and alternate. The primary version contains more functions (and is thus more complex) and produces good-quality results, but its correctness is more difficult to verify because of its high complexity and resource usage. By contrast, the alternate version contains only the minimum required functions (and is thus simpler) and produces less precise but acceptable results, and its correctness is easy to verify. We propose a scheduling algorithm that 1) guarantees that either the primary or the alternate version of each critical task completes in time and 2) attempts to complete as many primaries as possible. Our basic algorithm uses a fixed-priority-driven preemptive scheduling scheme to preallocate time intervals to the alternates and, at runtime, attempts to execute primaries first. An alternate is executed only 1) if its primary fails due to lack of time or manifestation of bugs or 2) when the latest time at which the alternate can start without missing the corresponding task deadline is reached. This algorithm is shown to be effective and easy to implement. It is further enhanced to prevent early failures in executing primaries from triggering failures in subsequent job executions, thus improving the efficiency of processor usage.
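
The two phases described in the abstract (preallocating intervals to alternates, then running primaries first at runtime) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It makes several assumptions that the abstract does not state: time is divided into unit slots, each deadline equals the task period, a smaller `priority` value means higher priority, every alternate needs at least one slot, and alternates are reserved as late as possible by sweeping the planning cycle backwards under fixed priorities. The names `Task`, `reserve_alternates`, `simulate`, and `faulty` are hypothetical. As a further simplification, a primary that fails its check waits for its reserved slots before its alternate runs. The sketch needs Python 3.9 or later for `math.lcm`.

```python
from dataclasses import dataclass
from math import lcm


@dataclass(frozen=True)
class Task:
    name: str
    period: int     # assumption: relative deadline equals the period
    primary: int    # execution time of the primary version, in slots
    alternate: int  # execution time of the alternate version, in slots (>= 1)
    priority: int   # assumption: smaller value means higher priority


def window(task, job):
    """Release time and deadline of the job-th instance of a task."""
    return job * task.period, (job + 1) * task.period


def reserve_alternates(tasks):
    """Offline phase: reserve unit slots for alternates as late as possible."""
    by_name = {t.name: t for t in tasks}
    horizon = lcm(*(t.period for t in tasks))
    slots = [None] * horizon
    left = {(t.name, j): t.alternate
            for t in tasks for j in range(horizon // t.period)}
    # Sweep backwards; each slot goes to the highest-priority job whose
    # window covers it and whose alternate still needs time.
    for s in range(horizon - 1, -1, -1):
        eligible = [(by_name[n].priority, n, j) for (n, j), rest in left.items()
                    if rest > 0
                    and window(by_name[n], j)[0] <= s < window(by_name[n], j)[1]]
        if eligible:
            _, n, j = min(eligible)
            slots[s] = (n, j)
            left[(n, j)] -= 1
    if any(rest > 0 for rest in left.values()):
        raise ValueError("alternates alone do not fit before their deadlines")
    # Latest start of each alternate: the first slot reserved for that job.
    notify = {}
    for s, job in enumerate(slots):
        if job is not None and job not in notify:
            notify[job] = s
    return slots, notify


def simulate(tasks, faulty=frozenset()):
    """Runtime phase: run primaries first, fall back to reserved alternates.

    `faulty` lists jobs whose primary finishes but produces a wrong result.
    """
    slots, notify = reserve_alternates(tasks)
    by_name = {t.name: t for t in tasks}
    prim_left = {job: by_name[job[0]].primary for job in notify}
    alt_left = {job: by_name[job[0]].alternate for job in notify}
    outcome = {}
    for now, reserved in enumerate(slots):
        if reserved is not None and reserved not in outcome:
            # The owner has no good primary result yet: run its alternate.
            alt_left[reserved] -= 1
            if alt_left[reserved] == 0:
                outcome[reserved] = "alternate"
            continue
        # Free or reclaimed slot: run the highest-priority released primary
        # that has not yet reached the latest start time of its alternate.
        ready = [(by_name[n].priority, j, n) for (n, j), rest in prim_left.items()
                 if rest > 0 and window(by_name[n], j)[0] <= now < notify[(n, j)]]
        if ready:
            _, j, n = min(ready)
            prim_left[(n, j)] -= 1
            if prim_left[(n, j)] == 0 and (n, j) not in faulty:
                outcome[(n, j)] = "primary"
    return outcome


if __name__ == "__main__":
    demo = [Task("A", 4, 1, 1, 0), Task("B", 6, 2, 1, 1)]
    print(simulate(demo))
    print(simulate(demo, faulty={("B", 0)}))
```

In this sketch, a slot reserved for a job whose primary has already succeeded is reclaimed for other primaries, which is one simple way to pursue the goal of completing as many primaries as possible. Every job still ends with either a primary or an alternate result, because the reserved slots alone suffice for all alternates.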