A navigation model for exploring scientific workflow provenance graphs

Authors:
Manish Kumar Anand;Shawn Bowers;Bertram Ludäscher
Affiliations:
University of California, Davis;Gonzaga University;University of California, Davis
Venue:
Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science
Year:
2009

Abstract

Many scientific workflow systems record provenance information in the form of data and process dependencies as part of workflow execution. Users often wish to explore these dependencies to reproduce, validate, and explain workflow results, e.g., by examining the data and processes that were used to produce particular workflow outputs. A natural interface for determining relevant provenance information, which is adopted by many systems, is to display the complete provenance dependency graph. However, for many workflows, provenance graphs can be large, with thousands or more nodes and edges. Displaying an entire provenance graph for such workflows can result in "provenance overload," where the large amount of provenance information available makes it difficult for users to find relevant information and explore data and process dependencies. In this paper, we address the challenges of "provenance overload" through a novel navigation model that provides operations for creating different views of provenance graphs along with approaches for easily navigating between different views. Further, our proposed navigation model provides an integrated approach for exploring, summarizing, and querying portions of provenance graphs. We also discuss different architectures for efficiently navigating large provenance graphs against an underlying provenance database.
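As a rough illustration of what a "view" over a provenance graph could look like, the following minimal sketch restricts a dependency graph to the data and processes that a chosen output depends on. It is not the navigation model proposed in the paper: the `ProvenanceGraph` class, the `ancestors` and `lineage_view` methods, and the node names are all hypothetical and chosen only for illustration.

```python
class ProvenanceGraph:
    """Toy provenance graph (hypothetical): an edge (src, dst) means dst depends on src."""

    def __init__(self, edges):
        self.edges = set(edges)
        # Map each node to the set of nodes it directly depends on.
        self.parents = {}
        for src, dst in self.edges:
            self.parents.setdefault(dst, set()).add(src)

    def nodes(self):
        return {n for edge in self.edges for n in edge}

    def ancestors(self, node):
        """Return every node that `node` transitively depends on."""
        seen, stack = set(), [node]
        while stack:
            current = stack.pop()
            for parent in self.parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def lineage_view(self, node):
        """Return a smaller graph holding only `node` and its ancestors."""
        keep = self.ancestors(node) | {node}
        return ProvenanceGraph(
            (s, d) for s, d in self.edges if s in keep and d in keep
        )


# Hypothetical run: data items d1..d6 and processes align, plot, other.
graph = ProvenanceGraph([
    ("d1", "align"), ("align", "d2"),
    ("d2", "plot"), ("d3", "plot"), ("plot", "d4"),
    ("d5", "other"), ("other", "d6"),
])
view = graph.lineage_view("d4")
print(sorted(view.nodes()))
```

Here the view for `d4` drops the unrelated `d5`, `other`, and `d6` nodes, which is the kind of reduction that helps when the full graph is too large to display at once.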