Generating sound workflow views for correct provenance analysis

  • Authors: Ziyang Liu; Susan B. Davidson; Yi Chen
  • Affiliations: Arizona State University; University of Pennsylvania; Arizona State University
  • Venue: ACM Transactions on Database Systems (TODS)
  • Year: 2011

Abstract

Workflow views abstract groups of tasks in a workflow into high-level composite tasks, in order to reuse subworkflows and facilitate provenance analysis. However, unless a view is carefully designed, it may not preserve the dataflow between tasks in the workflow; that is, it may not be sound. Unsound views can be misleading and cause incorrect provenance analysis. This article studies two tasks: efficiently identifying and correcting unsound workflow views with minimal changes, and constructing minimal sound and elucidative workflow views from a set of user-specified relevant tasks. In particular, two related problems are investigated. First, given a workflow view, we wish to split each unsound composite task into the minimal number of tasks such that the resulting view is sound. Second, given a workflow and a set of user-specified relevant tasks, we wish to generate a sound view such that each composite task contains at most one relevant task and the total number of tasks is minimized. We prove that both problems are NP-hard by reduction from independent set. We then propose two local optimality conditions (weak and strong) for each problem, and design polynomial-time algorithms that meet these conditions for both problems. Experiments show that our proposed algorithms are reasonably effective and efficient. The proposed techniques are useful for view analysis and construction not only for workflows but also for general networks.
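
To make the notion of a misleading view concrete, the following Python sketch flags dependencies that a view implies but the underlying workflow does not support. It is an illustration only, not the article's algorithm: the abstract does not give the formal definition of soundness, so the criterion used here (a path between two composite tasks in the view must be backed by a real path between some pair of their member tasks) is an assumption, and the names `reachable`, `view_edges`, and `spurious_dependencies` are hypothetical.

```python
def reachable(edges, start):
    """Return every node reachable from `start` by following one or more edges."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def view_edges(workflow_edges, view):
    """Lift workflow edges to edges between composite tasks (hypothetical helper)."""
    owner = {task: name for name, tasks in view.items() for task in tasks}
    return {(owner[u], owner[v]) for u, v in workflow_edges if owner[u] != owner[v]}


def spurious_dependencies(workflow_edges, view):
    """List pairs (A, B) where the view shows a path from A to B, but no task
    in A actually reaches a task in B in the workflow.

    Assumed criterion for illustration; not the article's formal definition.
    """
    lifted = view_edges(workflow_edges, view)
    bad = []
    for a in sorted(view):
        for b in sorted(reachable(lifted, a)):
            if a == b:
                continue
            backed = any(view[b] & reachable(workflow_edges, t) for t in view[a])
            if not backed:
                bad.append((a, b))
    return bad


# Two independent chains: a -> b and c -> d.
workflow = [("a", "b"), ("c", "d")]

# Grouping b and c together suggests that data flows from X to Z.
unsound_view = {"X": {"a"}, "Y": {"b", "c"}, "Z": {"d"}}
print(spurious_dependencies(workflow, unsound_view))

# Splitting the offending composite task removes the false dependency.
split_view = {"X": {"a"}, "Y1": {"b"}, "Y2": {"c"}, "Z": {"d"}}
print(spurious_dependencies(workflow, split_view))
```

In this toy case the composite task `Y` merges two unrelated tasks, so a provenance query over the view would wrongly report that `Z` depends on `X`. Splitting `Y` into two tasks is the kind of correction the first problem in the abstract asks for, although finding the minimal split in general is the NP-hard part that this sketch does not attempt.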