Scalable Multi-query Optimization for SPARQL

Authors:
Wangchao Le;Anastasios Kementsietsidis;Songyun Duan;Feifei Li
Venue:
ICDE '12 Proceedings of the 2012 IEEE 28th International Conference on Data Engineering
Year:
2012

Citing 0
Cited 5

SPAM: a SPARQL analysis and manipulation tool
Proceedings of the VLDB Endowment

Queen-bee: query interaction-aware for buffer allocation and scheduling problem
DaWaK'12 Proceedings of the 14th international conference on Data Warehousing and Knowledge Discovery

Optimizing RDF(S) queries on cloud platforms
Proceedings of the 22nd international conference on World Wide Web companion

Efficient social network data query processing on MapReduce
Proceedings of the 5th ACM workshop on HotPlanet

RDF analytics: lenses over semantic graphs
Proceedings of the 23rd international conference on World wide web

Abstract

This paper revisits the classical problem of multi-query optimization in the context of RDF/SPARQL. We show that the techniques developed for relational and semi-structured data and query languages are hard, if not impossible, to extend to the RDF data model and the graph query patterns expressed in SPARQL. In light of the NP-hardness of multi-query optimization for SPARQL, we propose heuristic algorithms that partition the input batch of queries into groups such that the queries in each group can be optimized together. The optimization has two essential components: an efficient algorithm to discover the common sub-structures of multiple SPARQL queries, and an effective cost model to compare candidate execution plans. Since our optimization techniques do not make any assumption about the underlying SPARQL query engine, they have the advantage of being portable across different RDF stores. Extensive experimental studies, performed on three popular RDF stores, show that the proposed techniques are effective, efficient and scalable.
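
The idea of partitioning a query batch into groups and then looking for structure shared within each group can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes each query is a flat set of (subject, predicate, object) triple patterns, that variable names are already aligned across queries, and that a greedy grouping by Jaccard similarity of predicate sets with a hypothetical threshold of 0.5 is good enough. All function names here are invented for illustration, and no cost model is included.

```python
from typing import FrozenSet, List, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Query = FrozenSet[Triple]       # one basic graph pattern (assumed representation)


def predicates(query: Query) -> Set[str]:
    """Collect the predicates used in a query's triple patterns."""
    return {p for _, p, _ in query}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_queries(queries: List[Query], threshold: float = 0.5) -> List[List[Query]]:
    """Greedy grouping: put each query in the first group whose first
    member has similar predicates, otherwise start a new group."""
    groups: List[List[Query]] = []
    for q in queries:
        for g in groups:
            if jaccard(predicates(q), predicates(g[0])) >= threshold:
                g.append(q)
                break
        else:
            groups.append([q])
    return groups


def shared_patterns(group: List[Query]) -> Query:
    """Triple patterns common to every query in the group. Assumes
    variables are already named consistently, which sidesteps the
    subgraph-matching problem a real optimizer has to solve."""
    return frozenset.intersection(*group)


# Toy batch: q1 and q2 overlap heavily, q3 is unrelated.
q1: Query = frozenset({("?x", "name", "?n"), ("?x", "age", "?a")})
q2: Query = frozenset({("?x", "name", "?n"), ("?x", "age", "?a"), ("?x", "email", "?e")})
q3: Query = frozenset({("?p", "price", "?c")})

groups = group_queries([q1, q2, q3])
common = shared_patterns(groups[0])
```

In this toy batch, q1 and q2 land in the same group and their shared triple patterns could be evaluated once; q3 forms its own group. A real system would additionally need to match patterns up to variable renaming and to use a cost model to decide whether sharing actually pays off.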