To cache or not to cache: the effects of warming cache in complex SPARQL queries

Authors:
Tomas Lampo;María-Esther Vidal;Juan Danilow;Edna Ruckhaus
Affiliations:
University of Maryland, College Park;Universidad Simón Bolívar, Caracas, Venezuela;Universidad Simón Bolívar, Caracas, Venezuela;Universidad Simón Bolívar, Caracas, Venezuela
Venue:
OTM'11 Proceedings of the 2011 Confederated International Conference on On the Move to Meaningful Internet Systems - Volume Part II
Year:
2011

Abstract

Existing RDF engines implement caching techniques that store intermediate results and reuse them in later steps of query execution, reducing execution time by avoiding repeated computation of the same results. Although these techniques can benefit many real-world queries, the same effects may not be observed in complex queries. In particular, queries composed of a large number of graph patterns that require computing large sets of intermediate results that cannot be reused, or queries that require complex computations to produce small amounts of data, may need further reordering or grouping to make effective use of the cache. In this paper, we address the problem of identifying the class of SPARQL queries that can benefit from caching data during query execution or from warming up the cache. We report experimental results showing that complex queries can take advantage of the cache if they are reordered and grouped into small star-shaped groups; here, complex queries are those that not only comprise a large number of patterns but may also produce a large number of intermediate results. Although the results are preliminary, they clearly show that queries rewritten as star-shaped groups can run up to three orders of magnitude faster in warm cache, while the original queries may perform poorly in warm cache.
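
The abstract does not spell out how queries are grouped, so the following is only a minimal sketch of one plausible reading: triple patterns that share a subject variable are collected into a star, and each star is split into small groups. The function name `star_groups`, the `max_size` parameter, and the LUBM-style sample patterns are assumptions made for illustration, not the authors' algorithm or data.

```python
from collections import defaultdict

def star_groups(patterns, max_size=3):
    """Partition triple patterns into small star-shaped groups.

    Hypothetical helper: a "star" here means patterns sharing the same
    subject term. Each star is cut into chunks of at most max_size.
    """
    by_subject = defaultdict(list)
    for subject, predicate, obj in patterns:
        by_subject[subject].append((subject, predicate, obj))

    groups = []
    for star in by_subject.values():
        # Split a large star into small groups, preserving pattern order.
        for start in range(0, len(star), max_size):
            groups.append(star[start:start + max_size])
    return groups

# Illustrative patterns only (LUBM-like vocabulary, assumed for the example).
query = [
    ("?student", "rdf:type", "ub:GraduateStudent"),
    ("?student", "ub:advisor", "?prof"),
    ("?student", "ub:takesCourse", "?course"),
    ("?prof", "ub:teacherOf", "?course"),
    ("?prof", "ub:worksFor", "?dept"),
]

for group in star_groups(query, max_size=2):
    print(group)
```

Under this reading, each small group yields an intermediate result that an engine could cache and reuse. Whether that matches the reordering evaluated in the paper cannot be determined from the abstract alone.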