Partitioned indexes for entity search over RDF knowledge bases

Authors:
Fang Du;Yueguo Chen;Xiaoyong Du
Affiliations:
School of Information, Renmin University of China, Beijing, China;Key Laboratory of Data Engineering and Knowledge Engineering, (Renmin University of China), MOE, China;School of Information, Renmin University of China, Beijing, China and Key Laboratory of Data Engineering and Knowledge Engineering, (Renmin University of China), MOE, China
Venue:
DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
Year:
2012




Abstract

The rapid growth of RDF data in RDF knowledge bases calls for efficient query processing techniques. This paper focuses on star-style SPARQL join queries, which are very common when users search for information about entities in RDF knowledge bases. We observe that the computational cost of such queries mainly comes from loading a large portion of the predicate-ahead indexes. We therefore propose to partition the whole RDF knowledge base according to the schemas of individual entities, so that only entities with similar schemas are allocated to the same cluster. This partitioning strategy yields a pruning mechanism that effectively isolates the correlations between partitions and queries. Consequently, a query is evaluated over only a small number of partitions, each with small predicate-ahead indexes. Experiments over a large real-life RDF data set show that our partitioned indexing techniques achieve significant performance improvements.
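
The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: an entity's "schema" is taken to be the set of predicates it uses, entities are grouped by exact schema equality (a simplification of the paper's clustering of similar schemas), and a "predicate-ahead index" is modeled as a map from predicate to object to subjects. The toy triples and the names `build_partitions` and `star_query` are hypothetical.

```python
from collections import defaultdict

# Toy (subject, predicate, object) triples; invented purely for illustration.
TRIPLES = [
    ("alice", "name", "Alice"),
    ("alice", "worksAt", "RUC"),
    ("bob", "name", "Bob"),
    ("bob", "worksAt", "RUC"),
    ("beijing", "name", "Beijing"),
    ("beijing", "population", "21M"),
]


def build_partitions(triples):
    """Group entities by predicate set and build one small index per group.

    Assumption: exact predicate-set equality stands in for schema similarity.
    Returns {schema (frozenset of predicates): {predicate: {object: {subjects}}}}.
    """
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    partitions = {}
    for s, pairs in by_subject.items():
        schema = frozenset(p for p, _ in pairs)
        index = partitions.setdefault(
            schema, defaultdict(lambda: defaultdict(set))
        )
        for p, o in pairs:
            index[p][o].add(s)
    return partitions


def star_query(partitions, constraints):
    """Answer a star query given as {predicate: required object}.

    A partition is skipped when its schema lacks any queried predicate,
    so its index is never consulted. Returns (matching subjects,
    number of partitions actually scanned).
    """
    if not constraints:
        raise ValueError("a star query needs at least one constraint")
    needed = set(constraints)
    results, scanned = set(), 0
    for schema, index in partitions.items():
        if not needed <= schema:
            continue  # pruned: no entity here can match every predicate
        scanned += 1
        candidates = [index[p].get(o, set()) for p, o in constraints.items()]
        results |= set.intersection(*candidates)
    return results, scanned


partitions = build_partitions(TRIPLES)
people, scanned = star_query(partitions, {"name": "Alice", "worksAt": "RUC"})
print(people, scanned)
```

In this toy data set the query on `name` and `worksAt` touches only the partition whose schema contains both predicates, while a query on `name` alone must scan every partition, since all entities have a `name`. The benefit of the pruning therefore depends on how selective the queried predicates are across schemas.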