Fine-grained dependencies within scientific workflow provenance specify lineage relationships between a workflow result and the input data, intermediate data, and computation steps used in the result's derivation. This information is often needed to determine the quality and validity of scientific data, and as such, plays a key role in both provenance standardization efforts and provenance query frameworks. While most scientific workflow systems can record basic information concerning the execution of a workflow, they typically fall into one of three categories with respect to recording dependencies: (1) they rely on workflow computation steps to declare dependency relationships at runtime; (2) they impose implicit assumptions concerning dependency patterns from which dependencies are automatically inferred; or (3) they do not assert any dependency information at all. We present an alternative approach that decouples dependency inference from workflow systems and underlying execution traces. In particular, we present a high-level declarative language for expressing explicit dependency rules that can be applied (at any time) to workflow trace events to generate fine-grained dependency information. This approach not only makes provenance dependency rules explicit, but also allows rules to be specified and refined by different users as needed. We describe our dependency rule language and an implementation that rewrites dependency rules into relational queries over underlying workflow traces. We also demonstrate the language using common types of dependency patterns found within scientific workflows.
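To make the idea of applying a dependency rule as a relational query over trace events more concrete, the following is a minimal sketch in Python using the standard `sqlite3` module. It is not the paper's rule language or implementation. The `event` table, its columns, the sample invocations, and the function names are all hypothetical, chosen only for illustration. The single rule shown ("every item an invocation writes depends on every item that invocation reads") is one plausible dependency pattern, written directly as SQL rather than generated from a declarative rule.

```python
import sqlite3

# Hypothetical trace schema (an assumption, not taken from the paper):
# one row per read or write event performed by a workflow step invocation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (invocation TEXT, kind TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?)",
    [
        ("align#1", "read", "seq_a"),
        ("align#1", "read", "seq_b"),
        ("align#1", "write", "alignment"),
        ("tree#1", "read", "alignment"),
        ("tree#1", "write", "phylogeny"),
    ],
)

# Illustrative rule, hand-written as a relational query: every item written
# by an invocation depends on every item read by that same invocation.
DEPENDS_ON_ALL_INPUTS = """
SELECT w.data AS derived, r.data AS source
FROM event AS w
JOIN event AS r ON w.invocation = r.invocation
WHERE w.kind = 'write' AND r.kind = 'read'
"""


def direct_dependencies(connection):
    """Apply the rule to the stored trace and return (derived, source) pairs."""
    return sorted(connection.execute(DEPENDS_ON_ALL_INPUTS).fetchall())


def lineage(connection, item):
    """Return every item that `item` transitively depends on under the rule."""
    query = f"""
    WITH RECURSIVE
      dep(derived, source) AS ({DEPENDS_ON_ALL_INPUTS}),
      ancestor(source) AS (
        SELECT source FROM dep WHERE derived = ?
        UNION
        SELECT dep.source FROM dep JOIN ancestor ON dep.derived = ancestor.source
      )
    SELECT source FROM ancestor ORDER BY source
    """
    return [row[0] for row in connection.execute(query, (item,))]


if __name__ == "__main__":
    print(direct_dependencies(conn))
    print(lineage(conn, "phylogeny"))
```

Because the rule is only a query over the recorded events, it can be run, replaced, or refined after the workflow has finished, without changing the workflow steps or the trace itself.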