Imbalance optimization in scientific workflows

  • Authors:
  • Weiwei Chen;Ewa Deelman;Rizos Sakellariou

  • Affiliations:
  • University of Southern California, Marina del Rey, CA, USA;University of Southern California, Marina del Rey, CA, USA;University of Manchester, Manchester, United Kingdom

  • Venue:
  • Proceedings of the 27th ACM International Conference on Supercomputing
  • Year:
  • 2013

Abstract

Scientific workflows are a means of defining and orchestrating large, complex, multi-stage computations that perform data analysis and/or simulation. Task clustering is a runtime optimization technique that merges multiple short workflow tasks into a single job so that job execution overhead is reduced and the overall runtime performance of the workflow is significantly improved. However, current task clustering strategies do not account for imbalance in either task runtime or task dependencies. In our work, we first investigate the different causes of runtime imbalance and dependency imbalance. We then introduce a series of metrics, based on our prior work, to measure the severity of runtime imbalance and dependency imbalance, respectively. Finally, we study a wide range of real scientific workflows to generalize the relationship between these metrics and balancing methods.
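
To make the idea of runtime imbalance in task clustering concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes independent tasks with known runtimes, a fixed per-job overhead, and jobs that run in parallel. The function names, the greedy longest-first balancing heuristic, and the use of the coefficient of variation as an imbalance indicator are all assumptions made for this example; the abstract does not specify the paper's metrics or balancing methods.

```python
from statistics import mean, pstdev


def cluster_round_robin(runtimes, num_jobs):
    """Merge tasks into num_jobs jobs in arrival order, ignoring runtimes."""
    jobs = [[] for _ in range(num_jobs)]
    for i, rt in enumerate(runtimes):
        jobs[i % num_jobs].append(rt)
    return jobs


def cluster_balanced(runtimes, num_jobs):
    """Hypothetical balancing heuristic: longest task first, lightest job first."""
    jobs = [[] for _ in range(num_jobs)]
    for rt in sorted(runtimes, reverse=True):
        # min() returns the first job with the smallest total runtime so far.
        min(jobs, key=sum).append(rt)
    return jobs


def makespan(jobs, overhead):
    """Finish time of the slowest job; each job pays the overhead once."""
    return max(overhead + sum(job) for job in jobs)


def runtime_cv(runtimes):
    """Coefficient of variation of task runtimes (assumed imbalance indicator)."""
    return pstdev(runtimes) / mean(runtimes)


# Eight tasks alternating between long and short runtimes (made-up values).
runtimes = [10, 1, 10, 1, 10, 1, 10, 1]
overhead = 5  # assumed fixed cost of launching one job

naive = cluster_round_robin(runtimes, num_jobs=2)
balanced = cluster_balanced(runtimes, num_jobs=2)

print("runtime variation:", round(runtime_cv(runtimes), 3))
print("round-robin makespan:", makespan(naive, overhead))
print("balanced makespan:", makespan(balanced, overhead))
```

In this toy input, clustering by arrival order places every long task in the same job, so one job dominates the finish time, while the runtime-aware grouping spreads the long tasks evenly. Both variants pay the job overhead only twice instead of once per task, which is the overhead saving that clustering targets.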