Many scientists use workflows to systematically design and run computational experiments. Once a workflow has been executed, the scientist may want to publish the resulting dataset so that, for example, other scientists can reuse it as input to their own experiments. To do so, the scientist needs to curate the dataset by specifying metadata that describes it, e.g., its derivation history, origins, and ownership. To assist the scientist in this task, we explore in this paper the use of provenance traces collected by workflow management systems when enacting workflows. Specifically, we identify the shortcomings of such raw provenance traces in supporting the data publishing task, and propose an approach for deriving distilled, yet more informative, provenance traces that are fit for that task.