Predicting Missing Provenance Using Semantic Associations in Reservoir Engineering

  • Authors:
  • Jing Zhao;Karthik Gomadam;Viktor Prasanna

  • Venue:
  • ICSC '11 Proceedings of the 2011 IEEE Fifth International Conference on Semantic Computing
  • Year:
  • 2011

Abstract

Provenance is becoming an important issue as a reliable estimator of data quality. However, provenance collection mechanisms in the reservoir engineering domain often result in missing provenance information. In this paper, we address the problem of predicting missing provenance information in reservoir engineering. Based on the observation that data items with specific semantic "connections" may share the same provenance, our approach annotates data items with domain entities defined in a domain ontology and represents these "connections" as sequences of relationships (also known as semantic associations) in the ontology graph. By analyzing annotated historical datasets with complete provenance information, we capture semantic associations that may imply identical provenance. A statistical analysis assigns a confidence value to each discovered association, indicating how much that association can be trusted when it is used for future provenance prediction. The semantic associations, along with their confidence measures, are then used by a voting algorithm to predict the missing provenance information. Our evaluation shows that the average precision of our approach is above 85% when one third of the provenance information is missing.
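
The abstract does not give the details of the statistical analysis or the voting algorithm, so the following is only a minimal sketch of one way confidence-weighted voting could work. It assumes, purely for illustration, that an association's confidence is the fraction of historical item pairs linked by it that share the same provenance, and that each linked item votes for its own provenance with that confidence as its weight. All names, association labels, and data values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def association_confidence(links, provenance):
    """Estimate, per association, how often linked items share provenance.

    links: iterable of (item_a, item_b, association) tuples (hypothetical format).
    provenance: dict mapping item id -> known provenance label.
    """
    same = defaultdict(int)
    total = defaultdict(int)
    for a, b, assoc in links:
        # Only pairs with complete provenance can be used as evidence.
        if a in provenance and b in provenance:
            total[assoc] += 1
            if provenance[a] == provenance[b]:
                same[assoc] += 1
    return {assoc: same[assoc] / total[assoc] for assoc in total}

def predict_provenance(item, links, provenance, confidence):
    """Pick the provenance with the largest confidence-weighted vote."""
    votes = defaultdict(float)
    for a, b, assoc in links:
        if item not in (a, b):
            continue
        other = b if a == item else a
        if other in provenance and assoc in confidence:
            votes[provenance[other]] += confidence[assoc]
    if not votes:
        return None  # no linked item with known provenance
    return max(votes, key=votes.get)

# Hypothetical associations, written as sequences of ontology relationships.
SAME_WELL = ("measuredAt", "sameWell")
SAME_FIELD = ("partOf", "sameField")

# Historical data with complete provenance (illustrative values only).
history_prov = {"d1": "simulatorA", "d2": "simulatorA",
                "d3": "fieldSurvey", "d4": "fieldSurvey"}
history_links = [
    ("d1", "d2", SAME_WELL), ("d3", "d4", SAME_WELL),
    ("d1", "d3", SAME_FIELD), ("d2", "d4", SAME_FIELD),
    ("d1", "d2", SAME_FIELD),
]
confidence = association_confidence(history_links, history_prov)

# Item d5 has missing provenance but is linked to three historical items.
new_links = [("d5", "d1", SAME_WELL),
             ("d5", "d3", SAME_FIELD), ("d5", "d4", SAME_FIELD)]
predicted = predict_provenance("d5", new_links, history_prov, confidence)
print(predicted)
```

In this toy data the single high-confidence association outweighs two low-confidence ones, which is the reason for weighting votes by confidence rather than simply counting them.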