Job Provenance --- Insight into Very Large Provenance Datasets

  • Authors:
  • Aleš Křenek;Luděk Matyska;Jiří Sitera;Miroslav Ruda;František Dvořák;Jiří Filipovič;Zdeněk Šustr;Zdeněk Salvet

  • Affiliations:
  • CESNET z.s.p.o., Praha 6, Czech Republic 160 00 and Institute of Computer Science, Masaryk University, Brno, Czech Republic 602 00;CESNET z.s.p.o., Praha 6, Czech Republic 160 00 and Institute of Computer Science, Masaryk University, Brno, Czech Republic 602 00;CESNET z.s.p.o., Praha 6, Czech Republic 160 00;CESNET z.s.p.o., Praha 6, Czech Republic 160 00 and Institute of Computer Science, Masaryk University, Brno, Czech Republic 602 00;CESNET z.s.p.o., Praha 6, Czech Republic 160 00;CESNET z.s.p.o., Praha 6, Czech Republic 160 00;CESNET z.s.p.o., Praha 6, Czech Republic 160 00;CESNET z.s.p.o., Praha 6, Czech Republic 160 00 and Institute of Computer Science, Masaryk University, Brno, Czech Republic 602 00

  • Venue:
  • Provenance and Annotation of Data and Processes
  • Year:
  • 2008

Abstract

Following the job-centric monitoring concept, the Job Provenance (JP) service organizes provenance records on a per-job basis. It is designed to manage a very large number of records, as was required in the EGEE project, where it was originally developed. The quantitative aspect is also a focus of the presented demonstration. We show JP's capability to retrieve data items of interest from a large dataset of full records of more than 1 million jobs, to perform non-trivial transformations on those data, and to organize the results in such a way that repeated interactive queries are possible. The application area of the demo is derived from that of previous Provenance Challenges. Though the topic of the demo, a computational experiment, is arranged rather artificially, the demonstration still delivers its main message: JP supports non-trivial transformations and interactive queries on large data sets.