A model of process documentation to determine provenance in mash-ups

Authors:
Paul Groth; Simon Miles; Luc Moreau
Affiliations:
University of Southern California, Marina del Rey, CA; King's College London, United Kingdom; University of Southampton, United Kingdom
Venue:
ACM Transactions on Internet Technology (TOIT)
Year:
2009

Abstract

Through technologies such as RSS (Really Simple Syndication), Web Services, and AJAX (Asynchronous JavaScript and XML), the Internet has facilitated the emergence of applications that are composed from a variety of services and data sources. Through tools such as Yahoo Pipes, these “mash-ups” can be composed in a dynamic, just-in-time manner from components provided by multiple institutions (e.g., Google, Amazon, your neighbor). However, when using these applications, it is not apparent where data comes from or how it is processed. Thus, to inspire trust and confidence in mash-ups, it is critical to be able to analyze their processes after the fact. These trailing analyses, in particular the determination of the provenance of a result (i.e., the process that led to it), are enabled by process documentation, which is documentation of an application's past process created by the components of that application at execution time. In this article, we define a generic conceptual data model that supports the autonomous creation of attributable, factual process documentation for dynamic multi-institutional applications. The data model is instantiated using two Internet formats, OWL and XML, and is evaluated with respect to questions about the provenance of results generated by a complex bioinformatics mash-up.