On the Efficiency of Provenance Queries

  • Authors: Anastasios Kementsietsidis; Min Wang
  • Venue: ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
  • Year: 2009

Abstract

While models for data provenance have been extensively studied in the literature, the efficient evaluation of the resulting provenance queries remains an open problem. Traditional query optimization techniques, like the use of general-purpose indexes, or the materialization of provenance data, fail on different fronts to address the problem. Provenance-specific optimization techniques, like the use of customized indexes, similarly prove inadequate since the techniques are bound to specific provenance models. Therefore, the need to develop generic provenance-aware techniques quickly becomes apparent.

In this paper, we argue for such a generic technique in the form of a provenance index structure that can be used to efficiently evaluate provenance queries in a variety of contexts. By highlighting the limitations of existing techniques, we identify the set of key properties of the generic index, including a novel property called duality, which guarantees that the single index can evaluate both backward provenance queries (which data items from a set I are associated with an item from set O) and forward provenance queries (which items from O are associated with an item from I).
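
The two query directions named in the abstract can be illustrated with a minimal Python sketch. This is not the index structure proposed in the paper: it is a naive baseline that keeps two hash maps over a hypothetical set of (input, output) associations. The class and method names (`ProvenanceRelation`, `add`, `backward`, `forward`) and the sample items are assumptions made purely for illustration.

```python
from collections import defaultdict


class ProvenanceRelation:
    """Toy in-memory store of (input, output) provenance associations.

    Hypothetical illustration only: it uses two separate maps, one per
    query direction, rather than a single dual index.
    """

    def __init__(self):
        self._inputs_of = defaultdict(set)   # output item -> associated input items
        self._outputs_of = defaultdict(set)  # input item -> associated output items

    def add(self, input_item, output_item):
        """Record that input_item (from I) is associated with output_item (from O)."""
        self._inputs_of[output_item].add(input_item)
        self._outputs_of[input_item].add(output_item)

    def backward(self, output_item):
        """Backward query: which items of I are associated with this item of O?"""
        return set(self._inputs_of.get(output_item, ()))

    def forward(self, input_item):
        """Forward query: which items of O are associated with this item of I?"""
        return set(self._outputs_of.get(input_item, ()))


# Sample associations (made up for the example).
rel = ProvenanceRelation()
for i, o in [("i1", "o1"), ("i2", "o1"), ("i2", "o2")]:
    rel.add(i, o)

backward_o1 = rel.backward("o1")  # inputs associated with o1
forward_i2 = rel.forward("i2")    # outputs associated with i2
```

Keeping one map per direction roughly doubles the storage and update work, which is the kind of overhead a single index with the duality property would presumably avoid.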