Modeling and querying probabilistic RDFS data sets with correlated triples

  • Authors:
  • Chi-Cheong Szeto; Edward Hung; Yu Deng

  • Affiliations:
  • Department of Computing, The Hong Kong Polytechnic University, Hong Kong; Department of Computing, The Hong Kong Polytechnic University, Hong Kong; IBM T.J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • APWeb'11 Proceedings of the 13th Asia-Pacific web conference on Web technologies and applications
  • Year:
  • 2011

Abstract

Resource Description Framework (RDF) and its extension RDF Schema (RDFS) are data models for representing information on the Web. They use RDF triples to make statements. Because knowledge is often incomplete, some triples are known to be true only with a certain degree of belief. Existing approaches either assign each triple a probability and assume that triples are statistically independent of each other, or model statistical relationships only over the possible objects of a triple. In this paper, we introduce probabilistic RDFS (pRDFS) to model statistical relationships among correlated triples by specifying the joint probability distributions over them. We give the syntax and semantics of pRDFS. Because some truth value assignments for triples may violate the RDFS semantics, we provide an algorithm to check consistency. Finally, we show how to find answers to queries in SPARQL. The probabilities of the answers are approximated using a Monte-Carlo algorithm.
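The idea of approximating an answer's probability by sampling can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the triples, the group structure `GROUPS`, and the function names `sample_world` and `estimate_probability` are all hypothetical, a query is simplified to a set of ground triples rather than a SPARQL pattern, and no RDFS inference or consistency checking is performed. Each group holds correlated triples with a joint distribution over their truth values, and separate groups are assumed to be independent.

```python
import random

# Hypothetical toy data (not from the paper).
T1 = ("ex:Tom", "rdf:type", "ex:Professor")
T2 = ("ex:Tom", "ex:teaches", "ex:Databases")
T3 = ("ex:Ann", "rdf:type", "ex:Student")

# Each group: (correlated triples, joint distribution over truth assignments).
# The probabilities within a group sum to 1; groups are assumed independent.
GROUPS = [
    ((T1, T2), {(True, True): 0.6, (True, False): 0.1,
                (False, True): 0.0, (False, False): 0.3}),
    ((T3,), {(True,): 0.8, (False,): 0.2}),
]


def sample_world(rng):
    """Draw one possible world: the set of triples that are true in it."""
    world = set()
    for triples, joint in GROUPS:
        assignments = list(joint)
        weights = [joint[a] for a in assignments]
        chosen = rng.choices(assignments, weights=weights, k=1)[0]
        world.update(t for t, is_true in zip(triples, chosen) if is_true)
    return world


def estimate_probability(query_triples, n_samples=20000, seed=0):
    """Monte-Carlo estimate of the probability that all query triples hold."""
    rng = random.Random(seed)
    wanted = set(query_triples)
    hits = sum(1 for _ in range(n_samples) if wanted <= sample_world(rng))
    return hits / n_samples


if __name__ == "__main__":
    # Exact values under the toy distributions are 0.6 and 0.7 * 0.8 = 0.56.
    print(estimate_probability([T1, T2]))
    print(estimate_probability([T1, T3]))
```

In this toy setting the exact probabilities can be computed by hand, which makes it easy to see that the sampled estimate converges toward them as `n_samples` grows. Treating T1 and T2 as independent would instead give 0.7 * 0.6 = 0.42 for the first query, which is the kind of error a joint distribution over correlated triples avoids.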