Provenance as dependency analysis

  • Authors:
  • James Cheney; Amal Ahmed; Umut A. Acar

  • Affiliations:
  • Laboratory for Foundations of Computer Science, University of Edinburgh, Informatics Forum, 10 Crichton Street, Edinburgh EH8 9AB, Scotland. Email: j.cheney@inf.ed.ac.uk
  • School of Informatics and Computing, Indiana University, 150 S. Woodlawn Ave., Bloomington, IN 47405, U.S.A. Email: amal@cs.indiana.edu
  • Max Planck Institute for Software Systems, Gottlieb-Daimler-Strasse, Building 49, D67663 Kaiserslautern, Germany. Email: umut@mpi-sws.org

  • Venue:
  • Mathematical Structures in Computer Science - Programming Language Interference and Dependence
  • Year:
  • 2011

Abstract

Provenance is information recording the source, derivation or history of some information. Provenance tracking has been studied in a variety of settings, particularly database management systems. However, although many candidate definitions of provenance have been proposed, the mathematical or semantic foundations of data provenance have received comparatively little attention. In this paper, we argue that dependency analysis techniques familiar from program analysis and program slicing provide a formal foundation for forms of provenance that are intended to show how (part of) the output of a query depends on (parts of) its input. We introduce a semantic characterisation of such dependency provenance for a core database query language, show that minimal dependency provenance is not computable, and provide dynamic and static approximation techniques. We also discuss preliminary implementation experience with using dependency provenance to compute data slices, or summaries of the parts of the input relevant to a given part of the output.
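
As an informal illustration of the dynamic approximation idea mentioned above, the following minimal Python sketch attaches a set of input labels to every value and propagates those sets through a small query (sum field b over rows whose field a exceeds a threshold). It is not the paper's formal semantics or implementation: the names Ann, leaf, add, gt and select_sum, and the label strings such as "r1.a", are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ann:
    """A value paired with the set of input labels it may depend on (hypothetical type)."""
    value: object
    deps: frozenset = frozenset()


def leaf(value, label):
    """Annotate an input value with its own (assumed unique) label."""
    return Ann(value, frozenset({label}))


def add(x, y):
    # The sum depends on everything either operand depends on.
    return Ann(x.value + y.value, x.deps | y.deps)


def gt(x, y):
    # The comparison result depends on both operands.
    return Ann(x.value > y.value, x.deps | y.deps)


def select_sum(rows, threshold):
    """Sum field 'b' over rows whose field 'a' exceeds the threshold."""
    total = Ann(0)
    for row in rows:
        test = gt(row["a"], threshold)
        # Changing 'a' could flip the test for any row, kept or not,
        # so the test's dependencies always flow into the result.
        total = Ann(total.value, total.deps | test.deps)
        if test.value:
            # Only rows that pass the test contribute their 'b' field.
            total = add(total, row["b"])
    return total


rows = [
    {"a": leaf(5, "r1.a"), "b": leaf(10, "r1.b")},
    {"a": leaf(1, "r2.a"), "b": leaf(20, "r2.b")},
    {"a": leaf(7, "r3.a"), "b": leaf(30, "r3.b")},
]

# The threshold is a constant in the query, so it carries no input labels.
result = select_sum(rows, Ann(3))
```

In this sketch the label "r2.b" does not appear in result.deps, because the second row fails the test and changing its b field alone cannot change the sum. The labels that do appear form a rough analogue of a data slice: the parts of the input that may be relevant to this output.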