Performance evaluation of the Karma provenance framework for scientific workflows

Authors:
Yogesh L. Simmhan; Beth Plale; Dennis Gannon; Suresh Marru
Affiliations:
Indiana University, Bloomington, IN; Indiana University, Bloomington, IN; Indiana University, Bloomington, IN; Indiana University, Bloomington, IN
Venue:
IPAW'06 Proceedings of the 2006 international conference on Provenance and Annotation of Data
Year:
2006

Abstract

Provenance about workflow executions and data derivations in scientific applications helps estimate data quality, track resources, and validate in silico experiments. The Karma provenance framework provides a means to collect workflow, process, and data provenance from data-driven scientific workflows and is used in the Linked Environments for Atmospheric Discovery (LEAD) project. This article presents a performance analysis of the Karma service compared against the contemporary PReServ provenance service. Our study finds that Karma scales exceedingly well for collecting and querying provenance records, showing linear or sub-linear scaling with an increasing number of provenance records and clients when tested against workloads on the order of 10,000 application-service invocations and over 36 concurrent clients.