ANAPSID: an adaptive query processing engine for SPARQL endpoints

Authors:
Maribel Acosta;Maria-Esther Vidal;Tomas Lampo;Julio Castillo;Edna Ruckhaus
Affiliations:
Universidad Simón Bolívar, Caracas, Venezuela;Universidad Simón Bolívar, Caracas, Venezuela;University of Maryland, College Park;Universidad Simón Bolívar, Caracas, Venezuela;Universidad Simón Bolívar, Caracas, Venezuela
Venue:
ISWC'11 Proceedings of the 10th international conference on The semantic web - Volume Part I
Year:
2011

Abstract

Following the design rules of Linked Data, the number of available SPARQL endpoints that support remote query processing is growing quickly; however, because of the lack of adaptivity, query executions may frequently be unsuccessful. First, fixed plans identified following the traditional optimize-then-execute paradigm may time out as a consequence of endpoint availability. Second, because blocking operators are usually implemented, endpoint query engines cannot produce results incrementally and may become blocked if data sources stop sending data. We present ANAPSID, an adaptive query engine for SPARQL endpoints that adapts query execution schedulers to data availability and run-time conditions. ANAPSID provides physical SPARQL operators that detect when a source becomes blocked or data traffic is bursty, and that opportunistically produce results as quickly as data arrives from the sources. Additionally, ANAPSID operators implement main memory replacement policies that move previously computed matches to secondary memory while avoiding duplicates. We compared ANAPSID's performance with that of RDF stores and endpoints, and observed that ANAPSID speeds up execution time, in some cases by more than one order of magnitude.
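
To make the contrast with blocking operators concrete, the following is a minimal sketch of a generic symmetric hash join, a textbook non-blocking join that emits a result as soon as a tuple from either input finds a partner. It is an illustration only and should not be read as ANAPSID's actual operators, whose design the abstract does not spell out. The function name `symmetric_hash_join`, the `(side, binding)` arrival format, and the sample data are all assumptions made for this example, and the sketch leaves out both the flushing of matches to secondary memory and the detection of blocked sources.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Binding = Dict[str, str]


def symmetric_hash_join(
    arrivals: Iterable[Tuple[str, Binding]], join_var: str
) -> Iterator[Binding]:
    """Yield joined bindings as soon as a match becomes possible.

    `arrivals` is an interleaved stream of (side, binding) pairs, where side
    is "left" or "right". The interleaving stands in for whichever source
    happens to deliver data next (a hypothetical format for this sketch).
    """
    # One in-memory hash table per input, keyed on the join variable.
    tables = {"left": defaultdict(list), "right": defaultdict(list)}
    for side, binding in arrivals:
        other = "right" if side == "left" else "left"
        key = binding[join_var]
        # Insert the new tuple into its own table first ...
        tables[side][key].append(binding)
        # ... then probe the opposite table. Each matching pair is emitted
        # exactly once, when the later of its two tuples arrives.
        for match in tables[other].get(key, ()):
            yield {**match, **binding}


if __name__ == "__main__":
    # Made-up bindings; the two sources deliver tuples in arbitrary order.
    stream = [
        ("left", {"drug": "d1", "name": "aspirin"}),
        ("right", {"drug": "d2", "target": "t9"}),
        ("right", {"drug": "d1", "target": "t1"}),
        ("left", {"drug": "d2", "name": "ibuprofen"}),
    ]
    for row in symmetric_hash_join(stream, "drug"):
        print(row)
```

Because the join is written as a generator, a consumer receives the first match after the third arrival instead of waiting for either input to finish, which is the incremental behavior the abstract contrasts with blocking operators.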