Query optimization in RDF stores is a challenging problem: SPARQL queries typically contain many more joins than equivalent relational plans and hence lead to a large join-order search space. In such cases, cost-based query optimization is often not possible. One practical reason is that statistics are typically missing in web-scale settings such as the Linked Open Datasets (LOD). The more profound reason is that, because RDF lacks schematic structure, join-hit ratio estimation requires complicated forms of correlated join statistics, and there are currently no methods to identify the relevant correlations beforehand. For this reason, good heuristics are essential in SPARQL query optimization, even when they are only partially used alongside cost-based statistics (i.e., hybrid query optimization). In this paper we describe a set of useful heuristics for SPARQL query optimizers. We present them in the context of a new Heuristic SPARQL Planner (HSP) that exploits the syntactic and structural variations of the triple patterns in a SPARQL query to choose an execution plan without needing any cost model. To this end, we define the variable graph and show a reduction of the SPARQL query optimization problem to the maximum-weight independent set problem. We implemented our planner on top of the MonetDB open-source column store and evaluated its effectiveness against the state-of-the-art RDF-3X engine; we also compared plan quality with a relational (SQL) equivalent of the benchmarks.
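The reduction mentioned above can be illustrated with a small sketch. The abstract does not spell out how the variable graph is built, so the construction here is an assumption made for illustration: each query variable is a node, its weight is the number of triple patterns it occurs in, and two variables are connected when they occur in the same triple pattern. The helper names (`variable_graph`, `max_weight_independent_set`) and the example query are hypothetical, and the exhaustive search is only suitable for the handful of variables in a typical query; it is not the planner's actual algorithm.

```python
from itertools import combinations


def variable_graph(patterns):
    """Build an (assumed) variable graph from triple patterns.

    Nodes are variables (terms starting with '?'); a node's weight is the
    number of patterns that mention it; an edge links two variables that
    co-occur in one pattern.
    """
    weights = {}
    edges = set()
    for pattern in patterns:
        variables = sorted({term for term in pattern if term.startswith("?")})
        for var in variables:
            weights[var] = weights.get(var, 0) + 1
        for a, b in combinations(variables, 2):
            edges.add((a, b))
    return weights, edges


def max_weight_independent_set(weights, edges):
    """Exhaustively search for a maximum-weight independent set.

    Exponential in the number of variables, so only usable for small
    graphs; the problem is NP-hard in general.
    """
    nodes = sorted(weights)
    best, best_weight = (), 0
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            chosen = set(subset)
            # Skip subsets containing two adjacent variables.
            if any(a in chosen and b in chosen for a, b in edges):
                continue
            total = sum(weights[var] for var in subset)
            if total > best_weight:
                best, best_weight = subset, total
    return set(best), best_weight


# Hypothetical query with four triple patterns.
patterns = [
    ("?x", "rdf:type", "Article"),
    ("?x", "creator", "?y"),
    ("?y", "name", "?n"),
    ("?x", "title", "?t"),
]
weights, edges = variable_graph(patterns)
chosen, total = max_weight_independent_set(weights, edges)
print(chosen, total)
```

In this example `?x` occurs in three patterns and is adjacent to `?y` and `?t`, so the heaviest set of mutually non-adjacent variables is `?x` together with `?n`. How the selected variables then shape the execution plan is not stated in the abstract and is left out of the sketch.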