Scheduling of scientific workflow in non-dedicated heterogeneous multicluster platform

  • Authors:
  • Jinghui Zhang;Junzhou Luo;Fang Dong

  • Affiliations:
  • School of Computer Science and Engineering, Southeast University, Nanjing, PR China;School of Computer Science and Engineering, Southeast University, Nanjing, PR China;School of Computer Science and Engineering, Southeast University, Nanjing, PR China

  • Venue:
  • Journal of Systems and Software
  • Year:
  • 2013

Abstract

Many scientific workflows can be structured as Parallel Task Graphs (PTGs), that is, graphs of data-parallel tasks. Adding data parallelism to a workflow provides opportunities for higher performance and scalability. However, workflow tasks are data-parallel and moldable, and clusters are not only heterogeneous but also non-dedicated to workflow execution, so scheduling such scientific workflows on a multicluster platform is challenging. To address this problem, we study the scheduling of scientific workflows on a non-dedicated heterogeneous multicluster platform, with the aim of minimizing the makespan of workflow execution. In this paper, three scheduling algorithms for effective workflow task mapping and resource allocation are proposed. Two of them, MHEFT-RSV and MHEFT-RSV-BD, are heuristic algorithms. The third is an exact branch-and-cut scheduling algorithm, which exploits the intertask precedence and resource constraints to accelerate the process of obtaining a feasible schedule with minimized makespan. Detailed simulation experiments show that, on average, the exact branch-and-cut algorithm obtains shorter makespans for small and medium-sized workflows, while MHEFT-RSV and MHEFT-RSV-BD achieve a better tradeoff between makespan and computation time for large scientific workflows.
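To make the problem setting concrete, the following is a minimal Python sketch of greedy earliest-finish-time scheduling for moldable tasks on heterogeneous, partly reserved clusters. It is not MHEFT-RSV, MHEFT-RSV-BD, or the branch-and-cut algorithm, whose details the abstract does not give. Everything in it is an illustrative assumption: the Amdahl-style execution time model, the representation of existing reservations as per-processor free times, the omission of data transfer costs, and the names `exec_time` and `greedy_schedule`.

```python
from graphlib import TopologicalSorter


def exec_time(work, alpha, procs, speed):
    """Run time of a moldable task on `procs` processors.

    Assumed Amdahl-style model (not taken from the paper): `alpha` is the
    serial fraction, `work` the sequential time on a speed-1 processor.
    """
    return (alpha + (1.0 - alpha) / procs) * work / speed


def greedy_schedule(tasks, preds, clusters):
    """Place each task where it finishes earliest.

    tasks:    name -> (work, alpha)
    preds:    name -> set of predecessor names (every task must be a key)
    clusters: name -> (speed, [time at which each processor becomes free])
    Returns (plan, makespan); plan maps task -> (cluster, procs, start, end).
    """
    free = {c: sorted(times) for c, (_, times) in clusters.items()}
    finish, plan = {}, {}
    for t in TopologicalSorter(preds).static_order():
        work, alpha = tasks[t]
        # A task is ready once all of its predecessors have finished.
        ready = max((finish[p] for p in preds[t]), default=0.0)
        best = None
        for c, (speed, _) in clusters.items():
            # Try every allocation size; the p earliest-free processors
            # are all available at the p-th smallest free time.
            for p in range(1, len(free[c]) + 1):
                start = max(ready, free[c][p - 1])
                end = start + exec_time(work, alpha, p, speed)
                if best is None or end < best[0]:
                    best = (end, start, c, p)
        end, start, c, p = best
        # The chosen processors stay busy until the task ends.
        free[c] = sorted([end] * p + free[c][p:])
        finish[t] = end
        plan[t] = (c, p, start, end)
    return plan, max(finish.values())


# Hypothetical diamond-free example: A feeds B and C.
tasks = {"A": (8.0, 0.0), "B": (8.0, 0.5), "C": (4.0, 0.0)}
preds = {"A": set(), "B": {"A"}, "C": {"A"}}
# "slow" is non-dedicated: two of its four processors are reserved until t=5.
clusters = {"fast": (2.0, [0, 0]), "slow": (1.0, [0, 0, 5, 5])}

plan, makespan = greedy_schedule(tasks, preds, clusters)
for name, (cluster, procs, start, end) in plan.items():
    print(f"{name}: {cluster} x{procs} [{start:.2f}, {end:.2f}]")
print("makespan:", makespan)
```

The sketch fixes each decision as soon as it is made and never backfills gaps before a reservation, so the makespan it reports is only an upper bound on what an exact method could reach for the same inputs.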