A conceptual model and predicate language for data selection and projection based on provenance

  • Authors:
  • David W. Archer;Lois M. L. Delcambre

  • Affiliations:
  • Department of Computer Science, Portland State University, Portland, OR;Department of Computer Science, Portland State University, Portland, OR

  • Venue:
  • TAPP'10 Proceedings of the 2nd conference on Theory and practice of provenance
  • Year:
  • 2010

Abstract

Writing relational database queries over current provenance databases can be complex and error-prone for three reasons: application data is typically mixed with provenance data, queries may require recursion, and the form in which provenance is maintained requires procedural parsing that is not easily framed in query syntax. As a result, it is often difficult to write queries that select rows or columns of data based on provenance. In this paper, we contribute a conceptual model and a predicate language for use in relational algebra that allow the user to write simple, nonrecursive queries to select data and attributes based on provenance. Our model also includes novel data and provenance features, including multi-valued attributes, that are useful for data curation settings. We show that our predicate language supports a broad class of queries that select application data based on provenance. We also show how selection of data with our language extensions can be emulated with an existing graph database system and its associated query language.
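
To make the idea of selecting and projecting on provenance concrete, here is a minimal Python sketch. It is not the authors' model or predicate language, which the abstract does not spell out. Every name in it (`Cell`, `sources`, `came_from`, `select`, `project_by_provenance`, and the source labels `hr_db` and `web_form`) is a hypothetical chosen for illustration, and provenance is reduced to a flat set of source names attached to each value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A value plus hypothetical provenance: the set of source names it came from."""
    value: object
    sources: frozenset = frozenset()


def came_from(attr, source):
    """Build a row predicate: true when row[attr] lists `source` in its provenance."""
    return lambda row: source in row[attr].sources


def select(rows, predicate):
    """Relational-style selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project_by_provenance(rows, source):
    """Keep only the attributes whose value in every row lists `source`."""
    if not rows:
        return []
    keep = [a for a in rows[0] if all(source in r[a].sources for r in rows)]
    return [{a: r[a] for a in keep} for r in rows]


rows = [
    {"name": Cell("Ada", frozenset({"hr_db"})),
     "city": Cell("Portland", frozenset({"web_form"}))},
    {"name": Cell("Bob", frozenset({"web_form"})),
     "city": Cell("Salem", frozenset({"web_form"}))},
]

# Select rows whose "name" value came from the (hypothetical) hr_db source.
names = [row["name"].value for row in select(rows, came_from("name", "hr_db"))]

# Project onto attributes that came entirely from the web_form source.
projected = project_by_provenance(rows, "web_form")
kept_attributes = list(projected[0].keys())
```

In this toy setting the query needs no recursion because each value carries its source set directly; a fuller treatment would have to account for chains of derivation, which this sketch deliberately leaves out.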