The management and querying of workflow provenance data underpin a range of activities, including the analysis of workflow results and the debugging of workflows or services. Such activities require efficient evaluation of lineage queries over potentially complex and voluminous provenance logs. Naïve implementations of lineage queries navigate provenance logs by joining tables that represent the flow of data between connected processors invoked from workflows. In this paper we provide an approach to provenance querying that: (i) avoids joins over provenance logs by using information about the workflow definition to construct queries that directly target the relevant lineage results; (ii) provides fine-grained provenance querying, even for workflows that create and consume collections; and (iii) scales effectively to complex workflows, workflows with large intermediate data sets, and queries over multiple workflows.
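The contrast between the two query styles can be illustrated with a minimal Python sketch. It is not the paper's implementation. The workflow, the log layout, and all names (`WORKFLOW`, `LOG`, `naive_lineage`, `informed_lineage`) are hypothetical. It also assumes that each processor runs exactly once per workflow run, which sidesteps the collection handling the paper addresses.

```python
# Hypothetical workflow definition: processor -> its upstream processors.
WORKFLOW = {
    "fetch": [],
    "clean": ["fetch"],
    "align": ["clean"],
    "report": ["align"],
}

# Hypothetical provenance log: one record per processor invocation.
# Simplifying assumption: each processor is invoked once per run.
LOG = [
    {"run": "r1", "processor": "fetch", "inputs": ["url1"], "outputs": ["raw1"]},
    {"run": "r1", "processor": "clean", "inputs": ["raw1"], "outputs": ["tidy1"]},
    {"run": "r1", "processor": "align", "inputs": ["tidy1"], "outputs": ["aln1"]},
    {"run": "r1", "processor": "report", "inputs": ["aln1"], "outputs": ["pdf1"]},
    {"run": "r2", "processor": "fetch", "inputs": ["url2"], "outputs": ["raw2"]},
]


def naive_lineage(log, value, target):
    """Walk the log backwards hop by hop; each scan stands in for one join."""
    frontier, seen, result = {value}, set(), set()
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for record in log:  # one pass over the log per traversed value
            if current in record["outputs"]:
                if record["processor"] == target:
                    result.update(record["inputs"])
                else:
                    frontier.update(record["inputs"])
    return sorted(result)


def is_upstream(workflow, target, start):
    """Check on the (small) workflow graph whether target feeds start."""
    stack, seen = [start], set()
    while stack:
        processor = stack.pop()
        if processor == target:
            return True
        if processor in seen:
            continue
        seen.add(processor)
        stack.extend(workflow[processor])
    return False


def build_index(log):
    """Index log records by (run, processor) for direct lookup."""
    return {(record["run"], record["processor"]): record for record in log}


def informed_lineage(workflow, index, run, start, target):
    """Traverse the workflow definition, then do one direct log lookup."""
    if not is_upstream(workflow, target, start):
        return []
    record = index.get((run, target))
    return sorted(record["inputs"]) if record else []


if __name__ == "__main__":
    print(naive_lineage(LOG, "pdf1", "fetch"))  # ['url1']
    print(informed_lineage(WORKFLOW, build_index(LOG), "r1", "report", "fetch"))  # ['url1']
```

In this sketch the naive version touches the log once per intermediate value, whereas the informed version walks only the workflow definition and then reads a single log record. Under the stated one-invocation-per-run assumption the two return the same answer. Handling collections would need extra bookkeeping that this sketch omits.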