Fault tolerance is an essential requirement for real-time systems because faults can have catastrophic consequences. In this paper, we investigate an efficient off-line scheduling algorithm that generates schedules in which real-time tasks with precedence constraints can tolerate the permanent failure of one processor in a heterogeneous system with a fully connected network. The tasks are assumed to be non-preemptable, and each task has two copies that are scheduled on different processors and mutually excluded in time. In recent literature, schedule quality has been improved by allowing a backup copy to overlap with other backup copies on the same processor. However, this approach assumes that tasks are independent of one another. To meet the needs of real-time systems where tasks have precedence constraints, a new overlapping scheme is proposed. We show that, given two tasks, the necessary conditions for their backup copies to safely overlap in time are that (1) their corresponding primary copies are scheduled on two different processors, (2) they are independent tasks, and (3) the execution of their backup copies implies the failures of the processors on which their primary copies are scheduled. For tasks with precedence constraints, the new overlapping scheme allows the backup copy of a task to overlap with its successors' primary copies, thereby further reducing schedule length. Based on a proposed reliability model, tasks are judiciously allocated to processors so as to maximize the reliability of heterogeneous systems. Additionally, the times for detecting and handling a permanent fault are incorporated into the scheduling scheme. We have performed experiments using synthetic workloads as well as a real-world application. Simulation results show that, compared with existing scheduling algorithms in the literature, our scheduling algorithm improves reliability by up to 22.4% (with an average of 16.4%) and achieves an improvement in performability, a measure that combines reliability and schedulability, by up to 421.9% (with an average of 49.3%).
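
The three necessary conditions for backup overlapping can be illustrated with a minimal Python sketch. It is not the paper's algorithm. The `Task` record, the `backup_only_on_failure` flag (a stand-in for condition 3), and the reading of "independent" as "neither task can reach the other in the precedence graph" are all assumptions made for illustration. Because the conditions are necessary rather than sufficient, a `True` result only means the pair is not ruled out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    primary_proc: int             # processor hosting the primary copy
    backup_only_on_failure: bool  # hypothetical flag standing in for condition (3)


def reachable(graph, src, dst):
    """Return True if dst can be reached from src along precedence edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def independent(graph, a, b):
    """Assumed meaning of independence: no precedence path in either direction."""
    return not reachable(graph, a, b) and not reachable(graph, b, a)


def passes_necessary_overlap_conditions(graph, a, b):
    """Check conditions (1)-(3) from the abstract for two tasks' backup copies."""
    different_primaries = a.primary_proc != b.primary_proc               # (1)
    no_precedence = independent(graph, a.name, b.name)                   # (2)
    failure_implied = a.backup_only_on_failure and b.backup_only_on_failure  # (3)
    return different_primaries and no_precedence and failure_implied


# Hypothetical precedence graph: t1 must finish before t3; t2 and t4 are unrelated.
graph = {"t1": ["t3"], "t2": [], "t3": [], "t4": []}
t1 = Task("t1", primary_proc=0, backup_only_on_failure=True)
t2 = Task("t2", primary_proc=1, backup_only_on_failure=True)
t3 = Task("t3", primary_proc=1, backup_only_on_failure=True)
t4 = Task("t4", primary_proc=0, backup_only_on_failure=True)
```

Here `t1` and `t2` pass all three checks, `t1` and `t3` fail condition (2) because of the precedence edge, and `t1` and `t4` fail condition (1) because their primary copies share a processor.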