The rapid growth of RDF data in RDF knowledge bases calls for efficient query processing techniques. This paper focuses on star-style SPARQL join queries, which are very common when users search for information about entities in RDF knowledge bases. We observe that the computational cost of such queries comes mainly from loading a large portion of the predicate-ahead indexes. We therefore propose to partition the whole RDF knowledge base according to the schema of individual entities, so that only entities with similar schemas are allocated to the same cluster. Such a partitioning strategy yields a pruning mechanism that effectively isolates the correlations between partitions and queries. Consequently, each query is evaluated over only a small number of partitions with small predicate-ahead indexes. Experiments on a large real-life RDF data set show significant performance improvements achieved by our partitioned indexing techniques.
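The pruning idea can be illustrated with a minimal Python sketch. It is not the paper's implementation and makes several simplifying assumptions: an entity's schema is taken to be the exact set of predicates it uses, entities are grouped by identical schema rather than clustered by similarity, and the names `partition_by_schema` and `star_query` are hypothetical. Each partition keeps its own small predicate-ahead index (predicate mapped to subject/object pairs), and a star query skips every partition whose schema lacks one of the requested predicates.

```python
from collections import defaultdict


def partition_by_schema(triples):
    """Group triples into partitions keyed by the subject's predicate set.

    Assumption: "schema" means the exact set of predicates of an entity.
    Returns {schema: {predicate: [(subject, object), ...]}}.
    """
    schema_of = defaultdict(set)
    for s, p, _ in triples:
        schema_of[s].add(p)

    partitions = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        key = frozenset(schema_of[s])
        partitions[key][p].append((s, o))  # per-partition predicate-ahead index
    return partitions


def star_query(partitions, predicates):
    """Return the subjects that have every predicate in `predicates`."""
    wanted = set(predicates)
    results = set()
    for schema, index in partitions.items():
        if not wanted <= schema:
            continue  # pruned: this partition cannot contain a match
        subjects = None
        for p in wanted:
            found = {s for s, _ in index[p]}
            subjects = found if subjects is None else subjects & found
        results |= subjects or set()
    return results


# Toy data (illustrative only, not from the paper's data set).
triples = [
    ("alice", "name", "Alice"), ("alice", "born", "1980"),
    ("bob", "name", "Bob"), ("bob", "born", "1975"), ("bob", "spouse", "carol"),
    ("acme", "name", "Acme"), ("acme", "founded", "1990"),
]
partitions = partition_by_schema(triples)
print(sorted(star_query(partitions, ["name", "born"])))  # ['alice', 'bob']
```

In this toy data set the query on `name` and `born` never touches the partition holding `acme`, because that partition's schema does not include `born`. A similarity-based clustering, as the abstract describes, would yield fewer and larger partitions, but the same subset test on each cluster's predicate set would still apply.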