While a number of scientific workflow systems support data provenance, they primarily focus on collecting and querying provenance for single workflow runs. Scientific research projects, however, typically involve (1) many interrelated workflows (where data from one or more workflow runs are selected and used as input to subsequent runs) and (2) tasks between workflow runs that cannot be fully automated. This paper addresses the need for recording data dependencies across multiple workflow runs and accommodating data management activities performed between runs. We define a new conceptual model for representing project-level provenance based on the notion of project histories and folders, and describe mechanisms to support this model in the collection-oriented modeling and design framework of Kepler. Our approach allows users to conveniently organize their projects and data using the familiar folder-hierarchy metaphor, while at the same time integrating this information with detailed provenance of data products generated via automated scientific workflows.
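To make the idea of project-level provenance concrete, the following is a minimal illustrative sketch, not the model defined in the paper and not Kepler's API. It assumes a project history can be approximated as an ordered list of events, each either an automated workflow run or a manual step performed between runs, and that folders are simply paths attached to data items. All class, method, file, and folder names here are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One step in a project history (hypothetical structure)."""
    kind: str            # "run" for a workflow run, "manual" for a step between runs
    label: str
    inputs: tuple = ()
    outputs: tuple = ()


@dataclass
class ProjectHistory:
    """Ordered events plus a folder assignment for each data item (an assumption)."""
    events: list = field(default_factory=list)
    folders: dict = field(default_factory=dict)   # item name -> folder path

    def record(self, kind, label, inputs=(), outputs=(), folder=None):
        """Append an event and optionally file its outputs under a folder."""
        event = Event(kind, label, tuple(inputs), tuple(outputs))
        self.events.append(event)
        if folder is not None:
            for item in event.outputs:
                self.folders[item] = folder
        return event

    def lineage(self, item):
        """Return every item that `item` transitively depends on, across runs."""
        producers = {out: ev for ev in self.events for out in ev.outputs}
        seen, stack = set(), [item]
        while stack:
            event = producers.get(stack.pop())
            if event is None:
                continue  # an original input with no recorded producer
            for parent in event.inputs:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


history = ProjectHistory()
history.record("run", "align", ["raw.fasta"], ["aligned.fasta"], folder="/project/run1")
history.record("manual", "select subset", ["aligned.fasta"], ["subset.fasta"], folder="/project/curated")
history.record("run", "infer tree", ["subset.fasta"], ["tree.nwk"], folder="/project/run2")

print(sorted(history.lineage("tree.nwk")))
# ['aligned.fasta', 'raw.fasta', 'subset.fasta']
```

Because the manual selection step is recorded as an event like any other, the lineage of the final tree reaches back through both workflow runs, which is the kind of cross-run dependency the abstract describes.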