Reliability and Scheduling on Systems Subject to Failures

  • Authors: Mourad Hakem; Franck Butelle
  • Affiliations: Universite Paris Nord, France; Universite Paris Nord, France
  • Venue: ICPP '07 Proceedings of the 2007 International Conference on Parallel Processing
  • Year: 2007

Abstract

This paper presents a new bi-objective greedy heuristic for scheduling parallel applications on heterogeneous distributed computing systems. The proposed algorithm, called BSA (Bi-objective Scheduling Algorithm), takes into account not only the makespan but also the failure probability of the application. Since the two conflicting objectives (performance and reliability) usually cannot both be optimized at once, a bi-objective compromise function is introduced. BSA has a low time complexity of O(e|P| + v log ω), where e and v are the number of edges and tasks, respectively, in the task graph of the application, |P| is the number of machines (processors) in the system, and ω is the width of the task graph. Experimental results show the performance of the proposed algorithm.
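
The abstract does not give the form of the compromise function, so the following is only an illustrative sketch of how a greedy step might trade finish time against failure probability when choosing a processor for one task. Everything here is an assumption rather than the BSA definition: the exponential failure model with a constant per-processor failure rate, the normalized weighted sum, the weight `theta`, and the function names `failure_probability`, `compromise_cost`, and `pick_processor` are all hypothetical.

```python
import math


def failure_probability(rate, duration):
    """Chance of at least one failure while running for `duration`.

    Assumption: failures follow an exponential law with constant `rate`.
    """
    return 1.0 - math.exp(-rate * duration)


def compromise_cost(finish, fail, max_finish, max_fail, theta):
    """Hypothetical compromise: weighted sum of the two normalized objectives.

    theta = 1.0 considers only finish time; theta = 0.0 only reliability.
    """
    norm_finish = finish / max_finish if max_finish > 0 else 0.0
    norm_fail = fail / max_fail if max_fail > 0 else 0.0
    return theta * norm_finish + (1.0 - theta) * norm_fail


def pick_processor(exec_times, ready_times, failure_rates, theta=0.5):
    """Return the index of the processor with the lowest compromise cost.

    exec_times[p]    : execution time of the task on processor p
    ready_times[p]   : earliest time processor p can start the task
    failure_rates[p] : assumed constant failure rate of processor p
    """
    finish = [r + e for r, e in zip(ready_times, exec_times)]
    fail = [failure_probability(lam, e)
            for lam, e in zip(failure_rates, exec_times)]
    max_finish, max_fail = max(finish), max(fail)
    costs = [compromise_cost(f, p, max_finish, max_fail, theta)
             for f, p in zip(finish, fail)]
    return min(range(len(costs)), key=costs.__getitem__)


if __name__ == "__main__":
    # Processor 0 is fast but failure-prone; processor 1 is slow but reliable.
    exec_times = [2.0, 4.0]
    ready_times = [0.0, 0.0]
    failure_rates = [0.1, 0.001]
    for theta in (1.0, 0.5, 0.0):
        chosen = pick_processor(exec_times, ready_times, failure_rates, theta)
        print(f"theta={theta}: processor {chosen}")
```

In this sketch, moving `theta` from 1.0 toward 0.0 shifts the choice from the fast, failure-prone processor to the slower, more reliable one, which is the kind of trade-off a compromise function has to arbitrate.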