Provenance traces of the Swift parallel scripting system

  • Authors:
  • Luiz M. R. Gadelha, Jr.; Michael Wilde; Marta Mattoso; Ian Foster

  • Affiliations:
  • National Laboratory for Scientific Computing, Brazil; Argonne National Laboratory and University of Chicago; Federal University of Rio de Janeiro, Brazil; Argonne National Laboratory and University of Chicago

  • Venue:
  • Proceedings of the Joint EDBT/ICDT 2013 Workshops
  • Year:
  • 2013

Abstract

We describe provenance traces generated from executions of scientific workflows managed by the Swift parallel scripting system. The traces follow the provenance data model used by MTCProv, the provenance management component of Swift. This model is similar to PROV: it represents most of PROV's core concepts and adds information about the scientific domain, computational resource consumption, and prospective provenance. We also describe provenance queries that follow patterns commonly found in high-performance computing and that are straightforward to support with MTCProv's built-in procedures. These queries often involve costly relational join operations and recursion, which makes them a relevant case for benchmarking.
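
To illustrate why such queries combine joins with recursion, the following is a minimal sketch of a lineage query over a toy relational provenance store. It is not MTCProv's schema or one of its built-in procedures: the `consumed` and `produced` tables, the sample rows, and the `build_example_db` and `ancestors` helpers are all hypothetical, chosen only to show the general pattern with Python's standard `sqlite3` module.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database with a hypothetical two-table provenance schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical schema: which process read or wrote which dataset.
        CREATE TABLE consumed (process_id TEXT, dataset_id TEXT);
        CREATE TABLE produced (process_id TEXT, dataset_id TEXT);

        -- Toy trace: raw.dat -> p1 -> clean.dat -> p2 -> result.dat,
        -- plus an unrelated branch other.dat -> p3 -> unrelated.dat.
        INSERT INTO consumed VALUES
            ('p1', 'raw.dat'), ('p2', 'clean.dat'), ('p3', 'other.dat');
        INSERT INTO produced VALUES
            ('p1', 'clean.dat'), ('p2', 'result.dat'), ('p3', 'unrelated.dat');
        """
    )
    return conn


def ancestors(conn, dataset_id):
    """Return every dataset that the given dataset was derived from, sorted by name."""
    query = """
        WITH RECURSIVE lineage(dataset_id) AS (
            -- Base case: direct inputs of the process that produced the target.
            SELECT c.dataset_id
            FROM produced AS p
            JOIN consumed AS c ON c.process_id = p.process_id
            WHERE p.dataset_id = ?
            UNION
            -- Recursive case: inputs of the processes that produced
            -- any dataset already found in the lineage.
            SELECT c.dataset_id
            FROM lineage AS l
            JOIN produced AS p ON p.dataset_id = l.dataset_id
            JOIN consumed AS c ON c.process_id = p.process_id
        )
        SELECT dataset_id FROM lineage ORDER BY dataset_id
    """
    return [row[0] for row in conn.execute(query, (dataset_id,))]


if __name__ == "__main__":
    conn = build_example_db()
    print(ancestors(conn, "result.dat"))
```

Each recursion step joins the datasets found so far against both tables, so the cost grows with the depth and fan-in of the derivation graph. This is the kind of join-heavy, recursive workload the abstract points to as a benchmarking case.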