Hybrid SPARQL queries: fresh vs. fast results

Authors:
Jürgen Umbrich;Marcel Karnstedt;Aidan Hogan;Josiane Xavier Parreira
Affiliations:
Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland;Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland;Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland;Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland
Venue:
ISWC'12 Proceedings of the 11th international conference on The Semantic Web - Volume Part I
Year:
2012

Abstract

For Linked Data query engines, there are inherent trade-offs between centralised approaches that can efficiently answer queries over data cached from parts of the Web, and live decentralised approaches that can provide fresher results over the entire Web at the cost of slower response times. Herein, we propose a hybrid query execution approach that returns fresher results from a broader range of sources vs. the centralised scenario, while speeding up results vs. the live scenario. We first compare results from two public SPARQL stores against current versions of the Linked Data sources they cache; results are often missing or out-of-date. We thus propose using coherence estimates to split a query into a sub-query for which the cached data have good fresh coverage, and a sub-query that should instead be run live. Finally, we evaluate different hybrid query plans and split positions in a real-world setup. Our results show that hybrid query execution can improve freshness vs. fully cached results while reducing the time taken vs. fully live execution.
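The query split described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the `TriplePattern` type, the `split_by_coherence` function, the idea of keying coherence estimates by predicate, the 0.8 threshold, and the example scores are all hypothetical. The sketch only shows the general idea of routing patterns with high estimated coherence to the cached store and the rest to live execution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriplePattern:
    """One triple pattern of a SPARQL basic graph pattern (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


def split_by_coherence(patterns, coherence, threshold=0.8):
    """Split patterns into a cached part and a live part.

    `coherence` maps a predicate to an estimate in [0, 1] of how well the
    cached data agree with the current live sources. Keying the estimate by
    predicate and using a fixed threshold are assumptions of this sketch.
    Predicates without an estimate are treated as stale and run live.
    """
    cached, live = [], []
    for pattern in patterns:
        score = coherence.get(pattern.predicate, 0.0)
        if score >= threshold:
            cached.append(pattern)
        else:
            live.append(pattern)
    return cached, live


# Illustrative query: names and locations change rarely, temperatures often.
patterns = [
    TriplePattern("?person", "foaf:name", "?name"),
    TriplePattern("?person", "foaf:based_near", "?place"),
    TriplePattern("?place", "ex:currentTemperature", "?temp"),
]

# Made-up coherence estimates, not measured values.
coherence = {
    "foaf:name": 0.95,
    "foaf:based_near": 0.85,
    "ex:currentTemperature": 0.10,
}

cached_part, live_part = split_by_coherence(patterns, coherence)
```

In this toy setup, the two `foaf:` patterns would be answered from the cached store and the temperature pattern would be fetched live; the bindings of the two parts would then be joined on the shared variable `?place`. A real engine would also have to choose the join order and the split position within the plan, which this sketch does not model.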