Query capabilities of the Karma provenance framework

Authors:
Yogesh L. Simmhan;Beth Plale;Dennis Gannon
Affiliations:
Computer Science Department, Indiana University, Bloomington, IN 47405, U.S.A. (all authors)
Venue:
Concurrency and Computation: Practice & Experience - The First Provenance Challenge
Year:
2008

Abstract

Provenance metadata in e-Science captures the derivation history of data products generated from scientific workflows. Provenance forms a glue linking workflow execution with associated data products, and finds use in determining the quality of derived data, tracking resource usage, and verifying and validating scientific experiments. In this article, we discuss the scope of provenance collected in the Karma provenance framework used in the LEAD Cyberinfrastructure project, distinguishing provenance metadata from generic annotations. We further describe our approaches to querying for different forms of provenance in Karma in the context of the queries posed in the First Provenance Challenge. We use an incremental, building-block method to construct provenance queries from the fundamental querying capabilities that the Karma service provides, which are centered on the provenance data model. This keeps the Karma service generic and simple while still supporting a wide range of queries. Karma successfully answers all but one challenge query. Copyright © 2007 John Wiley & Sons, Ltd.