The inherent flexibility of the RDF data model has led to its notable adoption in many domains, especially in the life sciences. Some of these domains have an emerging need to access data integrated from various distributed sources of information. It is not always possible to meet this need by simply loading all data into one central RDF store. For example, in inter-institutional collaborations for drug development and clinical research, participants often want to maintain control over their local databases. As an alternative, distributed query processing techniques can evaluate queries by accessing the remote data sources only on demand and in conformance with local authorization models. In this paper, we present an efficient approach to distributed query processing for large autonomous RDF databases. The groundwork is laid by a comprehensive RDF-specific synopsis at both the schema and the instance level. We present an optimizer that uses this synopsis to generate compact execution plans by precisely determining, at compile time, the sources that are relevant to a query. Furthermore, we present a tightly integrated query engine that further reduces the volume of intermediate results at run time. An extensive evaluation shows that our approach reduces query execution times by up to two orders of magnitude and transferred data volumes by up to three orders of magnitude compared to a naïve implementation.
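To illustrate the idea of compile-time source selection, the following is a minimal sketch and not the synopsis or optimizer described in the paper. It assumes a deliberately simple, hypothetical synopsis that records only the predicates and `rdf:type` classes observed at each source. The names `SourceSynopsis` and `relevant_sources`, as well as the `ex:` vocabulary, are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SourceSynopsis:
    """Hypothetical per-source summary: predicates and classes seen at the source."""
    name: str
    predicates: set = field(default_factory=set)
    classes: set = field(default_factory=set)


RDF_TYPE = "rdf:type"


def is_variable(term):
    """Query variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")


def relevant_sources(pattern, synopses):
    """Return the names of sources that may contribute matches for one triple pattern."""
    _subject, predicate, obj = pattern
    selected = []
    for syn in synopses:
        if is_variable(predicate):
            # Unbound predicate: this simple synopsis cannot prune any source.
            hit = True
        elif predicate == RDF_TYPE:
            # Schema-level check: does the source hold instances of the class?
            hit = bool(syn.classes) if is_variable(obj) else obj in syn.classes
        else:
            hit = predicate in syn.predicates
        if hit:
            selected.append(syn.name)
    return selected


# Two assumed sources and a three-pattern query, for illustration only.
synopses = [
    SourceSynopsis("drugs", {"ex:name", "ex:targets"}, {"ex:Drug"}),
    SourceSynopsis("trials", {"ex:name", "ex:testsDrug"}, {"ex:Trial"}),
]
query = [
    ("?d", "rdf:type", "ex:Drug"),
    ("?d", "ex:name", "?n"),
    ("?t", "ex:testsDrug", "?d"),
]
plan = {pattern: relevant_sources(pattern, synopses) for pattern in query}
```

In this sketch, only the `ex:name` pattern has to be sent to both sources; the other two patterns are each routed to a single source before any remote request is issued. A synopsis that also summarizes instance-level information, as the abstract describes, could prune further than this predicate-only check.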