DBNotes: a post-it system for relational databases based on provenance

  • Authors:
  • Laura Chiticariu;Wang-Chiew Tan;Gaurav Vijayvargiya

  • Affiliations:
  • UC Santa Cruz;UC Santa Cruz;UC Santa Cruz

  • Venue:
  • Proceedings of the 2005 ACM SIGMOD international conference on Management of data
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

We demonstrate DBNotes, a Post-It note system for relational databases where every piece of data may be associated with zero or more notes (or annotations). These annotations are transparently propagated along as data is being transformed. The method by which annotations are propagated is based on provenance (aka lineage): the annotations associated with a piece of data d in the result of a transformation consist of the annotations associated with each piece of data in the source where d is copied from. One immediate application of this system is to use annotations to systematically trace the provenance and flow of data. If every piece of source data is attached with an annotation that describes its address (i.e., origins), then the annotations of a piece of data in the result of a transformation describe its provenance. Hence, one can easily determine the provenance of data through a sequence of transformation steps simply by examining the annotations. Annotations can also be used to store additional information about data. Since a database schema is often proprietary, the ability to insert new information about data without having to change the underlying schema is a useful feature. For example, an error report could be attached to an erroneous piece of data, and this error report will be propagated to other databases along transformations, thus notifying other users of the error. Overall, the annotations on the result of a transformation can also provide an estimate on the quality of the resulting database.