Managing fine-grained provenance is a critical requirement for data stream management systems (DSMS), not only to support complex applications that require diagnostic capabilities and assurance, but also to provide advanced functionality such as revision processing or query debugging. This paper introduces a novel approach that uses operator instrumentation, i.e., modifying the behavior of operators, to generate and propagate fine-grained provenance through several operators of a query network. In addition to applying this technique to compute provenance eagerly during query execution, we also study how to decouple provenance computation from query processing to reduce run-time overhead and avoid unnecessary provenance retrieval. This includes computing a concise superset of the provenance, which allows lazily replaying a query network and reconstructing its provenance, as well as lazy retrieval to avoid unnecessary reconstruction of provenance. We develop stream-specific compression methods to reduce the computational and storage overhead of provenance generation and retrieval. Ariadne, our provenance-aware extension of the Borealis DSMS, implements these techniques. Our experiments confirm that Ariadne manages provenance with minor overhead and clearly outperforms query rewrite, the current state of the art.
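To make the idea of operator instrumentation concrete, the following is a minimal sketch of eager provenance propagation, not Ariadne's or Borealis's actual interface. It assumes that provenance is represented as a set of input tuple identifiers attached to every tuple, and that the query network is a filter followed by a count-based tumbling-window sum. All names (`PTuple`, `source`, `filter_op`, `window_sum`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class PTuple:
    """A stream tuple annotated with provenance (hypothetical representation)."""
    value: float
    prov: FrozenSet[int]  # identifiers of the input tuples this tuple derives from


def source(values: Iterable[float]) -> List[PTuple]:
    # Assumption: each input tuple gets a unique identifier (its position)
    # and is its own provenance.
    return [PTuple(v, frozenset({i})) for i, v in enumerate(values)]


def filter_op(pred: Callable[[float], bool], stream: List[PTuple]) -> List[PTuple]:
    # Instrumented filter: surviving tuples are forwarded unchanged,
    # so their provenance annotation is forwarded unchanged as well.
    return [t for t in stream if pred(t.value)]


def window_sum(size: int, stream: List[PTuple]) -> List[PTuple]:
    # Instrumented aggregate over count-based tumbling windows: the provenance
    # of each output is the union of the provenance of all tuples in its window.
    out = []
    for start in range(0, len(stream) - size + 1, size):
        window = stream[start:start + size]
        prov = frozenset().union(*(t.prov for t in window))
        out.append(PTuple(sum(t.value for t in window), prov))
    return out


# Usage: provenance travels through both operators of the network.
inputs = source([5, 1, 7, 3, 9, 2])
results = window_sum(2, filter_op(lambda v: v > 2, inputs))
for r in results:
    print(r.value, sorted(r.prov))
```

In this sketch the tuples with identifiers 1 and 5 are dropped by the filter and therefore never appear in any output's provenance. Carrying full identifier sets on every tuple is the simplest possible encoding; the compression methods and the lazy replay strategy mentioned in the abstract are not modeled here.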