Detecting common scientific workflow fragments using templates and execution provenance

  • Authors: Daniel Garijo; Oscar Corcho; Yolanda Gil

  • Affiliations: Universidad Politécnica de Madrid, Madrid, Spain; Universidad Politécnica de Madrid, Madrid, Spain; University of Southern California, Los Angeles, USA

  • Venue: Proceedings of the Seventh International Conference on Knowledge Capture
  • Year: 2013

Abstract

Provenance plays a major role in understanding and reusing the methods applied in a scientific experiment, as it provides a record of the inputs, the processes carried out, and the use and generation of intermediate and final results. In the specific case of in silico scientific experiments, a large variety of scientific workflow systems (e.g., Wings, Taverna, Galaxy, VisTrails) have been created to support scientists. All of these systems produce some form of provenance about the executions of the workflows that encode scientific experiments. However, provenance is normally recorded at a very low level of detail, which makes it hard to understand what happened during execution. In this paper we propose an approach to automatically obtain abstractions from low-level provenance data by finding common workflow fragments in workflow execution provenance and relating them to templates. We have tested our approach with a dataset of workflows published by the Wings workflow system. Our results show that these abstractions highlight the most common abstract methods used in the executions of a repository and relate different runs and workflow templates to each other.
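
To make the idea of "common workflow fragments" concrete, the following is a minimal sketch and not the algorithm used in the paper, which the abstract does not specify. It assumes each execution trace has already been reduced to a set of dataflow edges between steps, with each step labelled by the abstract component it ran. The run identifiers, component names, and the function names `three_step_chains` and `common_fragments` are all hypothetical. The sketch simply counts how many runs contain each three-step chain and keeps the chains that recur.

```python
from collections import Counter

# Hypothetical execution traces (assumption, not data from the paper):
# each run is a list of (producer_step, consumer_step) dataflow edges,
# with steps labelled by the abstract component type they executed.
runs = {
    "run1": [("Tokenize", "StopWords"), ("StopWords", "Stem"), ("Stem", "Classify")],
    "run2": [("Tokenize", "StopWords"), ("StopWords", "Stem"), ("Stem", "Cluster")],
    "run3": [("Tokenize", "Stem"), ("Stem", "Classify")],
}


def three_step_chains(edges):
    """Return the set of chains (a, b, c) such that a->b and b->c are edges."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)
    return {
        (a, b, c)
        for a, bs in successors.items()
        for b in bs
        for c in successors.get(b, ())
    }


def common_fragments(runs, min_support=2):
    """Map each chain to the number of runs containing it, keeping only
    chains that occur in at least min_support runs."""
    support = Counter()
    for edges in runs.values():
        # A set is passed to update(), so each run counts a chain at most once.
        support.update(three_step_chains(edges))
    return {frag: n for frag, n in support.items() if n >= min_support}


if __name__ == "__main__":
    for fragment, count in common_fragments(runs).items():
        print(" -> ".join(fragment), "appears in", count, "runs")
```

A real system would search for arbitrary subgraphs rather than fixed-length chains, and would map the recurring fragments back to the workflow templates they came from; this sketch only illustrates the support-counting step under the stated assumptions.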