Using Explicit Control Processes in Distributed Workflows to Gather Provenance

  • Authors:
  • Sérgio Manuel Cruz; Fernando Seabra Chirigati; Rafael Dahis; Maria Luiza Campos; Marta Mattoso

  • Affiliations:
  • PESC - COPPE; PESC - COPPE; PESC - COPPE; PPGI - IM/NCE, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, Brazil 21941-972; PESC - COPPE

  • Venue:
  • Provenance and Annotation of Data and Processes
  • Year:
  • 2008

Abstract

Distributing workflow tasks across high-performance environments involves both local processing and remote execution on clusters and grids. This distribution often requires interoperation between heterogeneous workflow definition languages and their corresponding execution engines. For example, a centralized Workflow Management System (WfMS) may control a workflow locally while delegating a sub-workflow that requires high performance to a grid WfMS. Workflow specification languages often provide different control-flow structures, so moving from one environment to another requires mappings between these languages. Because of this heterogeneity, control-flow structures available in one system may not be supported in another. In these heterogeneous distributed environments, provenance gathering also becomes heterogeneous. This work presents control-flow modules that aim to be independent of the WfMS. Inserting these modules into the workflow specification makes workflow execution control less dependent on heterogeneous workflow execution engines. In addition, the modules can be used to gather provenance data from both local and remote execution, thus allowing the same provenance registration in both environments, independent of the heterogeneous WfMS. The proposed modules extend ordinary workflow tasks by providing dynamic behavioral execution control. They were implemented in the VisTrails graphical workflow enactment engine, which offers a flexible infrastructure for provenance gathering.
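
The idea of a control-flow module that wraps ordinary tasks and records provenance in the same way regardless of the engine can be illustrated with a minimal Python sketch. This is not the authors' implementation and it does not use the VisTrails API. All names here (ProvenanceRecord, ProvenanceLog, run_task, if_module, while_module) are hypothetical, and tasks are assumed to be plain callables that take one value and return one value.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class ProvenanceRecord:
    """One provenance entry (hypothetical schema, for illustration only)."""
    module: str      # control module that ran the task, e.g. "If" or "While"
    task: str        # label of the executed branch or iteration
    inputs: Any
    output: Any
    started: float
    finished: float


@dataclass
class ProvenanceLog:
    """In-memory store; a real system would persist records somewhere."""
    records: List[ProvenanceRecord] = field(default_factory=list)

    def add(self, record: ProvenanceRecord) -> None:
        self.records.append(record)


def run_task(log: ProvenanceLog, module: str, task_name: str,
             task: Callable[[Any], Any], value: Any) -> Any:
    """Run a task and register provenance the same way wherever it executes."""
    started = time.time()
    output = task(value)
    log.add(ProvenanceRecord(module, task_name, value, output,
                             started, time.time()))
    return output


def if_module(log: ProvenanceLog, condition: Callable[[Any], bool],
              then_task: Callable[[Any], Any],
              else_task: Callable[[Any], Any], value: Any) -> Any:
    """Conditional control implemented in the module, not in the engine."""
    if condition(value):
        return run_task(log, "If", "then", then_task, value)
    return run_task(log, "If", "else", else_task, value)


def while_module(log: ProvenanceLog, condition: Callable[[Any], bool],
                 body: Callable[[Any], Any], value: Any,
                 max_iterations: int = 100) -> Any:
    """Loop control in the module; max_iterations guards against runaway loops."""
    iterations = 0
    while condition(value) and iterations < max_iterations:
        value = run_task(log, "While", f"body[{iterations}]", body, value)
        iterations += 1
    return value
```

Because the branching and looping logic lives inside the modules, the hosting engine only needs to be able to invoke a task; the same log format is produced whether the wrapped callable runs locally or forwards the work to a remote system.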