A navigation model for exploring scientific workflow provenance graphs

  • Authors:
  • Manish Kumar Anand;Shawn Bowers;Bertram Ludäscher

  • Affiliations:
  • University of California, Davis;Gonzaga University;University of California, Davis

  • Venue:
  • Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science
  • Year:
  • 2009

Abstract

Many scientific workflow systems record provenance information in the form of data and process dependencies as part of workflow execution. Users often wish to explore these dependencies to reproduce, validate, and explain workflow results, e.g., by examining the data and processes that were used to produce particular workflow outputs. A natural interface for finding relevant provenance information, adopted by many systems, is to display the complete provenance dependency graph. However, for many workflows, provenance graphs can be large, with thousands of nodes and edges or more. Displaying an entire provenance graph for such workflows can result in "provenance overload," where the sheer amount of available provenance information makes it difficult for users to find relevant information and explore data and process dependencies. In this paper, we address the challenges of "provenance overload" through a novel navigation model that provides operations for creating different views of provenance graphs, along with approaches for easily navigating between those views. Further, our proposed navigation model provides an integrated approach for exploring, summarizing, and querying portions of provenance graphs. We also discuss different architectures for efficiently navigating large provenance graphs against an underlying provenance database.
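As a rough illustration of what a restricted "view" of a provenance graph might look like, the sketch below extracts only the data and process nodes that a chosen output transitively depends on. It is not the navigation model proposed in the paper: the graph contents, the `DEPENDS_ON` mapping, and the `lineage_view` function are all hypothetical names introduced here, and the example assumes provenance is held in memory as a simple dependency mapping.

```python
from collections import deque

# Hypothetical provenance graph (an assumption for illustration only).
# Each node maps to the nodes it directly depends on: data items (d*)
# depend on the process invocation that produced them, and invocations
# (p*) depend on the data items they consumed.
DEPENDS_ON = {
    "d3": ["p2"],
    "p2": ["d1", "d2"],
    "d2": ["p1"],
    "p1": ["d0"],
    "d1": [],
    "d0": [],
    "d4": ["p3"],
    "p3": ["d0"],
}


def lineage_view(graph, output):
    """Return the subgraph of everything `output` transitively depends on."""
    seen = {output}
    queue = deque([output])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    # Keep only the visited nodes, each with its direct dependencies.
    return {node: list(graph.get(node, [])) for node in seen}


# View containing only the lineage of output d3; d4 and p3 are left out.
view = lineage_view(DEPENDS_ON, "d3")
```

Restricting the display to one output's lineage is one simple way to reduce the amount of provenance shown at once. For graphs held in a provenance database, the same traversal would presumably be expressed as a query rather than an in-memory search.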