On provenance minimization

Authors:
Yael Amsterdamer;Daniel Deutch;Tova Milo;Val Tannen
Affiliations:
Tel Aviv University and University of Pennsylvania, Tel Aviv, Israel;Ben Gurion University and University of Pennsylvania, Be'er Sheva, Israel;Tel Aviv University, Tel Aviv, Israel;University of Pennsylvania, Philadelphia, PA, USA
Venue:
Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2011

Abstract

Provenance information has proven very effective in capturing the computational process performed by queries, and has been used extensively as input to many advanced data management tools (e.g., view maintenance, trust assessment, or query answering in probabilistic databases). We study here the core of provenance information, namely the part of the provenance that appears in the computation of every query equivalent to the given one. This provenance core is informative, as it describes the part of the computational process that is inherent to the query. It is also useful as a compact input to the above-mentioned data management tools. We study algorithms that, given a query, compute an equivalent query that realizes the core provenance for all tuples in its result, and we do so for queries of varying expressive power. Finally, we observe that, in general, one would not want to require database systems to evaluate a specific query that realizes the core provenance. One would rather be able to find, possibly off-line, the core provenance of a given tuple in the output (computed by an arbitrary equivalent query), without rewriting the query. We provide algorithms for such direct computation of the core provenance.
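To make concrete the idea that equivalent queries can carry different amounts of provenance, the following is a minimal sketch, not the paper's algorithm. It assumes provenance polynomials with natural-number coefficients in the style of the provenance semirings framework. The relation R, the tags s and t, the two queries, and the helper `provenance` are all hypothetical names chosen for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical annotated binary relation: each tuple carries a provenance tag.
R = {("a", "b"): "s", ("a", "c"): "t"}

def provenance(body, head_var, relation):
    """Provenance polynomial per output value of a conjunctive query.

    body is a list of atoms (var1, var2) over one binary relation.
    A polynomial is a Counter mapping a monomial (sorted tuple of
    provenance tags) to its natural-number coefficient.
    """
    result = {}
    # Each choice of one tuple per atom is a candidate derivation.
    for choice in product(relation.items(), repeat=len(body)):
        binding = {}
        consistent = True
        for (v1, v2), ((c1, c2), _tag) in zip(body, choice):
            for var, const in ((v1, c1), (v2, c2)):
                # A variable must map to the same constant in every atom.
                if binding.setdefault(var, const) != const:
                    consistent = False
        if consistent:
            monomial = tuple(sorted(tag for _, tag in choice))
            result.setdefault(binding[head_var], Counter())[monomial] += 1
    return result

# Q(x)  :- R(x, y), R(x, z)   and   Q'(x) :- R(x, y)
# return the same answers under set semantics.
q_redundant = [("x", "y"), ("x", "z")]
q_minimal = [("x", "y")]

prov_redundant = provenance(q_redundant, "x", R)
prov_minimal = provenance(q_minimal, "x", R)

print(prov_redundant["a"])  # s*s + 2*s*t + t*t
print(prov_minimal["a"])    # s + t
```

Both queries return the single answer a, but the redundant query records every pair of tuples it joins, while the shorter query records only the two tuples that each suffice to derive the answer. The sketch only evaluates two given queries; finding a query that realizes the core provenance, or computing the core directly from a polynomial, is the subject of the algorithms described in the abstract.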