Partitioned indexes for entity search over RDF knowledge bases

  • Authors:
  • Fang Du; Yueguo Chen; Xiaoyong Du

  • Affiliations:
  • School of Information, Renmin University of China, Beijing, China; Key Laboratory of Data Engineering and Knowledge Engineering, (Renmin University of China), MOE, China; School of Information, Renmin University of China, Beijing, China and Key Laboratory of Data Engineering and Knowledge Engineering, (Renmin University of China), MOE, China

  • Venue:
  • DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
  • Year:
  • 2012

Abstract

The rapid growth of RDF data in RDF knowledge bases calls for efficient query processing techniques. This paper focuses on star-style SPARQL join queries, which are very common when users search for information about entities in RDF knowledge bases. We observe that the computational cost of such queries mainly comes from loading a large portion of the predicate-ahead indexes. We therefore propose to partition the whole RDF knowledge base based on the schemas of individual entities, so that only entities with similar schemas are allocated to the same cluster. Such a partitioning strategy yields a pruning mechanism that effectively isolates the correlations between partitions and queries. Consequently, a query is evaluated over only a small number of partitions, each with small predicate-ahead indexes. Experiments over a large real-life RDF data set show the significant performance improvements achieved by our partitioned indexing techniques.
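
The idea of schema-based partitioning with pruning can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how "similar schemas" are clustered, so the sketch assumes the simplest case, in which entities with exactly the same predicate set share a partition. The names `build_partitions` and `star_query`, and the sample triples, are hypothetical.

```python
from collections import defaultdict


def build_partitions(triples):
    """Group entities by schema and build one predicate-ahead index per group.

    Assumption: a schema is the exact set of predicates an entity uses.
    Returns {schema: {predicate: {object: set_of_subjects}}}.
    """
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    partitions = {}
    for s, pairs in by_subject.items():
        schema = frozenset(p for p, _ in pairs)
        index = partitions.setdefault(schema, {})
        for p, o in pairs:
            index.setdefault(p, {}).setdefault(o, set()).add(s)
    return partitions


def star_query(partitions, pattern):
    """Evaluate a star query: all triple patterns share one subject variable.

    pattern maps each predicate to a constant object, or to None when the
    object is a variable. Returns (matching_entities, partitions_scanned).
    """
    needed = set(pattern)
    results = set()
    scanned = 0
    for schema, index in partitions.items():
        # Pruning: a partition lacking any queried predicate cannot match.
        if not needed <= schema:
            continue
        scanned += 1
        candidates = None
        for p, o in pattern.items():
            if o is None:
                matched = set().union(*index[p].values())
            else:
                matched = index[p].get(o, set())
            candidates = matched if candidates is None else candidates & matched
            if not candidates:
                break
        results |= candidates or set()
    return results, scanned


# Hypothetical sample data: two schemas, hence two partitions.
triples = [
    ("e1", "type", "Person"), ("e1", "name", "Ada"), ("e1", "bornIn", "London"),
    ("e2", "type", "Person"), ("e2", "name", "Alan"), ("e2", "bornIn", "London"),
    ("e3", "type", "City"), ("e3", "name", "London"),
    ("e4", "type", "Person"), ("e4", "name", "Bob"),
]
partitions = build_partitions(triples)
entities, scanned = star_query(partitions, {"type": "Person", "bornIn": "London"})
```

In this toy data set, the query on `type` and `bornIn` touches only the partition whose schema contains both predicates, so the index for entities without `bornIn` is never loaded. A real system would presumably merge similar schemas into fewer clusters to avoid a very large number of tiny partitions.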