Workflow Scheduling to Minimize Data Movement Using Multi-constraint Graph Partitioning

  • Authors:
  • Masahiro Tanaka;Osamu Tatebe

  • Affiliations:
  • -;-

  • Venue:
  • CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Among scheduling algorithms of scientific workflows, the graph partitioning is a technique to minimize data transfer between nodes or clusters. However, when the graph partitioning is simply applied to a complex workflow DAG, tasks in each parallel phase are not always evenly assigned to computation nodes since the graph partitioning algorithm is not aware of edge directions that represent task dependencies. Thus, we propose a new method of task assignment based on Multi-Constraint Graph Partitioning. This method relates the dimension of weight vectors to the rank of a task phase defined by traversing the task graph. Our algorithm is implemented in the Pwrake workflow system and evaluated the performance of the Montage workflow using a computer cluster. The result shows that the file size accessed from remote nodes is reduced from 88% to 14% of the total file size accessed during the workflow and that the elapsed time is reduced by 31%.