ES3: A Demonstration of Transparent Provenance for Scientific Computation

  • Authors:
  • James Frew; Peter Slaughter

  • Affiliations:
  • Donald Bren School of Environmental Science and Management, University of California, Santa Barbara, CA 93106-5131, USA; Donald Bren School of Environmental Science and Management, University of California, Santa Barbara, CA 93106-5131, USA

  • Venue:
  • Provenance and Annotation of Data and Processes
  • Year:
  • 2008

Abstract

The Earth System Science Server (ES3) is a software environment for data-intensive Earth science, with unique capabilities for automatically and transparently capturing and managing the provenance of arbitrary computations. Transparent acquisition means that scientists need not express their computations in specific languages or schemas for provenance to be available. ES3 models provenance as relationships between processes and their input and output files. These relationships are captured by monitoring read and write accesses at various levels in the science software and asynchronously converting them to time-ordered streams of provenance events, which are stored in an XML database. An ES3 provenance query returns an XML serialization of a provenance graph, traversed forward or backward from a specified process or file. We demonstrate ES3 provenance by generating complex data products from Earth satellite imagery.
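
The provenance model described in the abstract can be illustrated with a small Python sketch. This is not ES3 code: the `Event` record, the `build_graph`, `trace`, and `to_xml` helpers, the XML element and attribute names, and the sample process and file names are all hypothetical, chosen only to show how time-ordered read and write events might yield a graph that can be walked forward or backward from a file or process.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Event:
    """One observed file access (hypothetical record layout, not ES3's)."""
    timestamp: float
    process: str
    action: str  # "read" or "write"
    path: str


def build_graph(events):
    """Turn events into forward and backward adjacency maps.

    A read is an edge file -> process; a write is an edge process -> file.
    """
    forward = defaultdict(set)
    backward = defaultdict(set)
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.action == "read":
            src, dst = ("file", ev.path), ("process", ev.process)
        elif ev.action == "write":
            src, dst = ("process", ev.process), ("file", ev.path)
        else:
            raise ValueError(f"unknown action: {ev.action!r}")
        forward[src].add(dst)
        backward[dst].add(src)
    return forward, backward


def trace(adjacency, start):
    """Breadth-first walk returning every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def to_xml(start, nodes):
    """Serialize a query result as XML (element names are made up here)."""
    root = ET.Element("provenance", kind=start[0], name=start[1])
    for kind, name in sorted(nodes):
        ET.SubElement(root, "node", kind=kind, name=name)
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-step pipeline over satellite imagery.
events = [
    Event(1.0, "calibrate", "read", "raw.hdf"),
    Event(2.0, "calibrate", "write", "radiance.dat"),
    Event(3.0, "classify", "read", "radiance.dat"),
    Event(4.0, "classify", "write", "snowmap.tif"),
]
forward, backward = build_graph(events)

# Backward query: everything that contributed to the final product.
ancestors = trace(backward, ("file", "snowmap.tif"))
# Forward query: everything derived from the raw input.
descendants = trace(forward, ("file", "raw.hdf"))
print(to_xml(("file", "snowmap.tif"), ancestors))
```

In this sketch the backward query from `snowmap.tif` reaches both processes and both upstream files, while the forward query from `raw.hdf` reaches everything derived from it. Keeping two adjacency maps is one simple way to make both query directions equally cheap; the abstract does not say how ES3 itself stores or indexes its graph.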