While provenance has been extensively studied in the literature, the efficient evaluation of provenance queries remains an open problem. Traditional query optimization techniques, such as general-purpose indexes or the materialization of provenance data, each fall short in different ways. This motivates the development of provenance-aware access methods. This paper starts by identifying key requirements that are largely specific to provenance queries and necessary for their efficient evaluation. The first property, called duality, requires that a single access method be used to evaluate both backward provenance queries (which input items of an analysis generated a given output item?) and forward provenance queries (which outputs of an analysis did a given input item generate?). The second property, called locality, requires that provenance query evaluation time depend mainly on the size of the query result and be largely independent of the total size of the provenance data. Motivated by these requirements, we identify data structures that satisfy both properties, implement them, and, through a detailed set of experiments, illustrate their effectiveness in evaluating provenance queries.
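To make the two properties concrete, the following is a minimal Python sketch and not the data structure proposed in the paper, which the abstract does not describe. The class name `DualProvenanceIndex` and its methods are hypothetical. It assumes provenance is available as a set of (input item, output item) pairs and keeps two hash maps behind one interface, so both query directions go through the same object and each lookup touches only the entries that belong to the answer.

```python
from collections import defaultdict


class DualProvenanceIndex:
    """Toy in-memory index over (input, output) provenance pairs.

    Hypothetical illustration only: two hash maps behind one interface,
    which doubles the stored pairs compared with a single compact structure.
    """

    def __init__(self):
        self._outputs_of = defaultdict(set)  # input item -> outputs it contributed to
        self._inputs_of = defaultdict(set)   # output item -> inputs it was derived from

    def add(self, input_item, output_item):
        """Record that input_item contributed to output_item."""
        self._outputs_of[input_item].add(output_item)
        self._inputs_of[output_item].add(input_item)

    def backward(self, output_item):
        """Backward query: which inputs generated this output?"""
        # .get avoids creating empty entries for unknown items.
        return sorted(self._inputs_of.get(output_item, ()))

    def forward(self, input_item):
        """Forward query: which outputs did this input generate?"""
        return sorted(self._outputs_of.get(input_item, ()))


index = DualProvenanceIndex()
for src, dst in [("a", "x"), ("b", "x"), ("b", "y")]:
    index.add(src, dst)

print(index.backward("x"))  # ['a', 'b']
print(index.forward("b"))   # ['x', 'y']
```

In this sketch, the cost of a query is an expected constant-time hash lookup plus work that grows with the number of returned items (including the sort used for deterministic output), regardless of how many pairs are stored overall. That is the behavior the locality property asks for. The duplicated storage is the price of this simple approach.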