Harvesting RDF triples

Authors: Joe Futrelle
Affiliations: National Center for Supercomputing Applications, Urbana, IL
Venue: IPAW'06 Proceedings of the 2006 international conference on Provenance and Annotation of Data
Year: 2006

Citing 4

The open archives initiative: building a low-barrier interoperability framework
Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries

Preferential reasoning on a web of trust
ISWC'05 Proceedings of the 4th international conference on The Semantic Web

Reasoning with multi-version ontologies: a temporal logic approach
ISWC'05 Proceedings of the 4th international conference on The Semantic Web

Provenance-based validation of e-science experiments
ISWC'05 Proceedings of the 4th international conference on The Semantic Web

Cited 5

A model of process documentation to determine provenance in mash-ups
ACM Transactions on Internet Technology (TOIT)

The Foundations for Provenance on the Web
Foundations and Trends in Web Science

W3P: Building an OPM based provenance model for the Web
Future Generation Computer Systems

Applying provenance in distributed organ transplant management
IPAW'06 Proceedings of the 2006 international conference on Provenance and Annotation of Data

An identity crisis in the life sciences
IPAW'06 Proceedings of the 2006 international conference on Provenance and Annotation of Data

Abstract

Managing scientific data requires tools that can track complex provenance information about digital resources and workflows. RDF triples are a convenient abstraction for combining independently generated factual statements, including statements about provenance [1]. Harvesting is a strategy for asynchronously acquiring distributed information for the purposes of aggregation and analysis [2]. Harvesting typically requires that information be temporally scoped and attributed to some creator or information source. An RDF triple asserts a fact without attributing it to any actor or period of time, so the abstraction must be extended to support typical harvesting scenarios. This paper compares standard, conventional, and non-standard means of extending RDF triples to associate them with attribution and timing information. Then, it considers the implications of these techniques for harvesting and presents some implementation sketches based on a journaling strategy.
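To make the idea of attributed, time-scoped triples concrete, the following is a minimal illustrative sketch and not the paper's implementation, which the abstract does not detail. It assumes one plausible reading of a journaling strategy: each triple is wrapped in an append-only journal entry that records a source and a timestamp, and a harvester asks for everything recorded after a given time. The names `JournalEntry`, `Journal`, `harvest`, and `replay`, and the `retracted` flag, are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Iterator, List, Optional, Set, Tuple

# A bare RDF-style triple: (subject, predicate, object). It carries no
# attribution or timing of its own.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class JournalEntry:
    """Hypothetical wrapper adding attribution and timing to a triple."""
    triple: Triple
    source: str              # who asserted (or retracted) the triple
    recorded_at: datetime    # when the assertion entered the journal
    retracted: bool = False  # True if this entry withdraws the triple


class Journal:
    """Append-only log of entries (an assumed design, for illustration)."""

    def __init__(self) -> None:
        self._entries: List[JournalEntry] = []

    def record(self, triple: Triple, source: str, recorded_at: datetime,
               retracted: bool = False) -> JournalEntry:
        entry = JournalEntry(triple, source, recorded_at, retracted)
        self._entries.append(entry)
        return entry

    def harvest(self, since: datetime) -> Iterator[JournalEntry]:
        """Yield entries recorded strictly after `since`, in append order."""
        for entry in self._entries:
            if entry.recorded_at > since:
                yield entry


def replay(entries: Iterable[JournalEntry],
           triples: Optional[Set[Triple]] = None) -> Set[Triple]:
    """Apply harvested entries to a set of triples and return the result."""
    current: Set[Triple] = set() if triples is None else set(triples)
    for entry in entries:
        if entry.retracted:
            current.discard(entry.triple)
        else:
            current.add(entry.triple)
    return current
```

Under these assumptions, a harvester would remember the timestamp of its last request, call `harvest(since=last_seen)`, and pass the result to `replay` to bring its local aggregate up to date. Keeping the journal append-only means a retraction is itself an attributed, timestamped entry rather than a silent deletion.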