WORKEM: Representing and Emulating Distributed Scientific Workflow Execution State

  • Authors:
  • Lavanya Ramakrishnan; Dennis Gannon; Beth Plale

  • Venue:
  • CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
  • Year:
  • 2010

Abstract

Scientific workflows have become an integral part of cyberinfrastructure as their computational complexity and data sizes have grown. However, the complexity of the distributed infrastructure makes it challenging to design new workflows, determine the right management policies, and debug, test, or reproduce errors. Today, workflow engines manage the dependencies between the tasks of a workflow, and tools are available to wrap scientific codes. There is a need for a customizable, isolated, and manageable testing container for the design, evaluation, and deployment of distributed workflows. To build such an environment, we need to be able to model, represent, capture, and possibly reuse the execution flows within each task of a workflow in a way that accurately reflects execution behavior. In this paper, we present the design and implementation of WORKEM, an extensible framework that can be used to represent and emulate workflow execution state. We also detail the use of the framework in two case studies: (a) the design and testing of an orchestration system and (b) the generation of a provenance database. Our evaluation shows that the framework has minimal overhead and can scale to run hundreds of workflows in a short time with a high degree of parallelism.