Fault-tolerant scheduling is essential for large-scale computational Grid systems, in which geographically distributed nodes often cooperate to execute a task. The primary-backup approach is a common fault-tolerance technique in which each task has a primary copy and a backup copy on two different processors. Backup overloading has been proposed to reduce replication cost by allowing a backup copy to be overloaded with (that is, to share a time slot with) other backup copies on the same processor. In this paper, we consider two classes of independent tasks, both of which have fault-tolerance requirements. Class 1 tasks require the response time to be as short as possible when a fault occurs, while Class 2 tasks prefer backups with minimum replication cost. We propose two algorithms, MRC-ECT and MCT-LRC. MRC-ECT is shown to guarantee an optimal backup schedule in terms of replication cost, while MCT-LRC can schedule a backup with minimum completion time and low replication cost. We conduct extensive simulation experiments to quantify the performance of the proposed algorithms.
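
To make the primary-backup and backup-overloading ideas concrete, the following is a minimal sketch of a feasibility check for placing a backup copy. It is not the MRC-ECT or MCT-LRC algorithm, whose details the abstract does not give. It assumes the usual single-fault model (at most one processor fails at a time), under which two backups may share a time slot on one processor only if their primaries run on different processors. The names `Backup`, `overlaps`, and `can_place_backup` are hypothetical, and primary copies already occupying the processor are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    """A backup copy reserved on some processor (hypothetical structure)."""
    task: str
    primary_proc: int  # processor that runs this task's primary copy
    start: float       # reserved start time of the backup
    end: float         # reserved end time of the backup


def overlaps(a: Backup, b: Backup) -> bool:
    """True if the two half-open intervals [start, end) intersect."""
    return a.start < b.end and b.start < a.end


def can_place_backup(new: Backup, existing: list, backup_proc: int) -> bool:
    """Check whether `new` may be reserved on processor `backup_proc`.

    Assumes at most one processor fails at a time. Primary copies that
    may already occupy `backup_proc` are not modeled in this sketch.
    """
    # Primary and backup must reside on two different processors.
    if new.primary_proc == backup_proc:
        return False
    for other in existing:
        # Overloading is allowed only when the overlapping backups protect
        # primaries on different processors: a single failure can then
        # activate at most one of them.
        if overlaps(new, other) and other.primary_proc == new.primary_proc:
            return False
    return True


# Processor 2 already holds a backup for t1, whose primary is on processor 0.
existing_on_p2 = [Backup("t1", primary_proc=0, start=0.0, end=4.0)]
ok = can_place_backup(Backup("t2", 1, 2.0, 5.0), existing_on_p2, backup_proc=2)
clash = can_place_backup(Backup("t3", 0, 3.0, 6.0), existing_on_p2, backup_proc=2)
```

Here `ok` is true because the primaries of t1 and t2 sit on different processors, so their backups can be overloaded; `clash` is false because a failure of processor 0 would require both backups to run in the same slot.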