Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Supporting Fine-grained Data Lineage in a Database Visualization Environment
ICDE '97 Proceedings of the Thirteenth International Conference on Data Engineering
Lineage Tracing for General Data Warehouse Transformations
Proceedings of the 27th International Conference on Very Large Data Bases
Chimera: AVirtual Data System for Representing, Querying, and Automating Data Derivation
SSDBM '02 Proceedings of the 14th International Conference on Scientific and Statistical Database Management
Data Provenance: Some Basic Issues
FST TCS 2000 Proceedings of the 20th Conference on Foundations of Software Technology and Theoretical Computer Science
Practical Lineage Tracing in Data Warehouses
ICDE '00 Proceedings of the 16th International Conference on Data Engineering
The Anatomy of the Grid: Enabling Scalable Virtual Organizations
International Journal of High Performance Computing Applications
PrIMe: A methodology for developing provenance-aware applications
ACM Transactions on Software Engineering and Methodology (TOSEM)
Applying the virtual data provenance model
IPAW'06 Proceedings of the 2006 international conference on Provenance and Annotation of Data
Hi-index | 0.00 |
Large-scale e-Science experiments present unprecedented data handling requirements with their multi-petabyte data storages. Complex software applications, such as the ATLAS High Energy Physics experiment at CERN, run throughout Grid computing sites around the world in a distributed environment, with scientists performing concurrent analysis on data and producing new data products shared among the collaboration. In this paper, we introduce a multi-phase infrastructure to achieve data provenance for an e-Science experiment. We propose an infrastructure to integrate provenance onto an existing legacy application with strong emphasis on scalability and explore the relationship between provenance and metadata introducing a model where data provenance is made available as metadata through a separate reasoning phase.