SPARQL Endpoint Metrics for Quality-Aware Linked Data Consumption

Authors:
Johannes Lorey
Affiliations:
Hasso Plattner Institute, Potsdam, Germany
Venue:
Proceedings of International Conference on Information Integration and Web-based Applications & Services
Year:
2013

Abstract

In recent years, dozens of publicly accessible Linked Data repositories containing vast amounts of knowledge presented in the Resource Description Framework (RDF) format have been set up worldwide. By utilizing the SPARQL query language, users can consume, integrate, and present data from a federation of sources for different application scenarios. However, several challenges arise for distributed query processing across multiple SPARQL endpoints, such as devising suitable query optimization or result caching strategies. For implementing these techniques, one crucial aspect lies in determining appropriate endpoint features. In this work, we introduce several metrics that enable universal and fine-grained characterization of arbitrary Linked Data repositories. We present comprehensive approaches for deriving these metrics and validate them through extensive evaluation on real-world SPARQL endpoints. Finally, we discuss possible implications of our findings for data consumers.