While there has been substantial work on both database and workflow provenance, the two problems have only been examined in isolation. It is widely accepted that the existing models are incompatible. Database provenance is fine-grained and captures changes to tuples in a database. In contrast, workflow provenance is represented at a coarser level and reflects the functional model of workflow systems, which is stateless: each computational step derives a new artifact. In this paper, we propose a new approach to combine database and workflow provenance. We address the mismatch between the different kinds of provenance by using a temporal model that explicitly represents the database states as updates are applied. We discuss how, under this model, reproducibility is obtained for workflows that manipulate databases, and how different queries that straddle the two provenance traces can be evaluated. We also describe a proof-of-concept implementation that integrates a workflow system and a commercial relational database.
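The idea of a temporal model that keeps every database state addressable can be illustrated with a small sketch. This is not the implementation described in the paper: the names `TemporalTable`, `StepRecord`, and `run_step` are hypothetical, and an in-memory list stands in for a relational database. The sketch assumes that each update closes the validity interval of the previous row version and opens a new one, and that each workflow step records the database version it read and the version it produced.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class TemporalTable:
    """Hypothetical append-only table; each row version is valid on [start, end)."""
    # Each entry: (key, value, start_version, end_version or None if current).
    rows: List[Tuple[str, Any, int, Optional[int]]] = field(default_factory=list)
    version: int = 0  # identifier of the current database state

    def update(self, key: str, value: Any) -> int:
        """Close the current row version for `key` (if any) and open a new one."""
        self.version += 1
        for i, (k, v, start, end) in enumerate(self.rows):
            if k == key and end is None:
                self.rows[i] = (k, v, start, self.version)
        self.rows.append((key, value, self.version, None))
        return self.version

    def snapshot(self, at: int) -> Dict[str, Any]:
        """Reconstruct the database state as of version `at`."""
        return {k: v for k, v, start, end in self.rows
                if start <= at and (end is None or at < end)}


@dataclass
class StepRecord:
    """Workflow-level trace entry linking a step to the states it read and wrote."""
    name: str
    read_version: int
    write_version: int


def run_step(table: TemporalTable, trace: List[StepRecord], name: str,
             fn: Callable[[Dict[str, Any]], Dict[str, Any]]) -> None:
    """Run a stateless step `fn` against the current state and log both versions."""
    read_version = table.version
    state = table.snapshot(read_version)
    for key, value in fn(state).items():
        table.update(key, value)
    trace.append(StepRecord(name, read_version, table.version))


table = TemporalTable()
trace: List[StepRecord] = []
run_step(table, trace, "load", lambda s: {"a": 1, "b": 2})
run_step(table, trace, "double_a", lambda s: {"a": s["a"] * 2})

# Re-running a step against the state it originally read reproduces its output.
replayed = {"a": table.snapshot(trace[1].read_version)["a"] * 2}
```

Because old row versions are never overwritten, the state a step consumed can be rebuilt from its recorded `read_version`, which is what lets a stateless workflow trace refer to a mutable database. A query that straddles both traces would, under the same assumptions, join `trace` entries with the row versions whose intervals begin between `read_version` and `write_version`.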