SPADE is an open source software infrastructure for data provenance collection and management. The underlying data model used throughout the system is graph-based, consisting of vertices and directed edges that are modeled after the node and relationship types described in the Open Provenance Model. The system has been designed to decouple the collection, storage, and querying of provenance metadata. At its core is a novel provenance kernel that mediates between the producers and consumers of provenance information, and handles the persistent storage of records. It operates as a service, peering with remote instances to enable distributed provenance queries. The provenance kernel on each host handles the buffering, filtering, and multiplexing of incoming metadata from multiple sources, including the operating system, applications, and manual curation. Provenance elements can be located locally with queries that use wildcard, fuzzy, proximity, range, and Boolean operators. Ancestor and descendant queries are transparently propagated across hosts until a terminating expression is satisfied, while distributed path queries are accelerated with provenance sketches.
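
To make the graph model and ancestor queries more concrete, the following is a minimal single-host sketch in Python. It is not SPADE's API: the class `ProvenanceGraph`, its methods, the `stop` callback, and the example file names are all hypothetical names chosen for illustration. The only parts taken from an external source are the vertex types (Artifact, Process, Agent) and the five edge types, which follow the Open Provenance Model. The sketch assumes that edges point from an effect to its cause, so an ancestor query is a breadth-first walk along outgoing edges that stops expanding once a caller-supplied terminating condition holds.

```python
from collections import deque
from dataclasses import dataclass, field

# Open Provenance Model edge types, mapped to (source type, destination type).
# Edges point from effect to cause.
OPM_EDGES = {
    "used": ("Process", "Artifact"),
    "wasGeneratedBy": ("Artifact", "Process"),
    "wasTriggeredBy": ("Process", "Process"),
    "wasDerivedFrom": ("Artifact", "Artifact"),
    "wasControlledBy": ("Process", "Agent"),
}


@dataclass
class ProvenanceGraph:
    """Hypothetical in-memory provenance store (illustration only)."""
    vertices: dict = field(default_factory=dict)  # id -> (type, annotations)
    edges: dict = field(default_factory=dict)     # source id -> [(edge type, destination id)]

    def add_vertex(self, vid, vtype, **annotations):
        self.vertices[vid] = (vtype, annotations)
        self.edges.setdefault(vid, [])

    def add_edge(self, src, etype, dst):
        # Reject edges whose endpoint types do not match the OPM definition.
        expected = OPM_EDGES[etype]
        actual = (self.vertices[src][0], self.vertices[dst][0])
        if actual != expected:
            raise ValueError(f"{etype} expects {expected}, got {actual}")
        self.edges[src].append((etype, dst))

    def ancestors(self, start, stop=lambda vid, vtype, ann: False):
        """Breadth-first ancestor query.

        A vertex for which `stop` returns True is reported but not expanded,
        which plays the role of a terminating expression in this sketch.
        """
        found, visited, queue = [], {start}, deque([start])
        while queue:
            current = queue.popleft()
            for _etype, parent in self.edges[current]:
                if parent in visited:
                    continue
                visited.add(parent)
                found.append(parent)
                vtype, ann = self.vertices[parent]
                if not stop(parent, vtype, ann):
                    queue.append(parent)
        return found


# Example: a two-step pipeline run by one user (all names are made up).
g = ProvenanceGraph()
g.add_vertex("raw.csv", "Artifact")
g.add_vertex("clean", "Process", pid=101)
g.add_vertex("clean.csv", "Artifact")
g.add_vertex("plot", "Process", pid=102)
g.add_vertex("fig.png", "Artifact")
g.add_vertex("alice", "Agent")
g.add_edge("clean", "used", "raw.csv")
g.add_edge("clean.csv", "wasGeneratedBy", "clean")
g.add_edge("plot", "used", "clean.csv")
g.add_edge("fig.png", "wasGeneratedBy", "plot")
g.add_edge("clean", "wasControlledBy", "alice")

print(g.ancestors("fig.png"))
print(g.ancestors("fig.png", stop=lambda vid, vtype, ann: vid == "clean.csv"))
```

In a distributed setting such as the one the abstract describes, a walk like this would presumably have to be continued on a remote instance whenever an edge crosses a host boundary. That step, along with storage, filtering, and provenance sketches, is outside the scope of this sketch.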