Data summaries for on-demand queries over linked data

Authors:
Andreas Harth;Katja Hose;Marcel Karnstedt;Axel Polleres;Kai-Uwe Sattler;Jürgen Umbrich
Affiliations:
Karlsruhe Institute of Technology, Karlsruhe, Germany;Max-Planck Institute for Informatics, Saarbruecken, Germany;National University of Ireland, Galway, Galway, Ireland;National University of Ireland, Galway, Galway, Ireland;Ilmenau University of Technology, Ilmenau, Germany;National University of Ireland, Galway, Galway, Ireland
Venue:
Proceedings of the 19th International Conference on World Wide Web
Year:
2010

Abstract

Typical approaches for querying structured Web data collect (crawl) and pre-process (index) large amounts of data in a central data repository before allowing query answering. However, this time-consuming pre-processing phase leverages the benefits of Linked Data -- where structured data is accessible live and up-to-date at distributed Web resources that may change constantly -- only to a limited degree, as query results can never be current. An ideal query answering system for Linked Data should return current answers in a reasonable amount of time, even on corpora as large as the Web. Query processors that evaluate queries directly on the live sources require knowledge of the contents of those data sources. In this paper, we develop and evaluate an approximate index structure that summarises the graph-structured content of sources adhering to Linked Data principles, provide an algorithm that exploits this source summary to answer conjunctive queries over Linked Data on the Web, and evaluate the system using synthetically generated queries. The experimental results show that our lightweight index structure enables complete and up-to-date query results over Linked Data, while keeping the overhead for querying low and providing a satisfying source ranking at no additional cost.
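
The abstract does not describe the internals of the index, so the following is only an illustrative sketch of the general idea of summary-based source selection, not the structure evaluated in the paper. It assumes a simple hash-bucket summary: each term of a triple is hashed into a fixed number of buckets, and each bucket records which sources contributed triples. The names `SourceSummary`, `bucket`, `NUM_BUCKETS`, `candidates`, and `rank_sources` are hypothetical, and join variables shared between triple patterns are ignored.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 1024  # assumed summary size; fewer buckets mean more false positives


def bucket(term):
    """Map an RDF term (given as a string) to a bucket with a deterministic hash."""
    return zlib.crc32(term.encode("utf-8")) % NUM_BUCKETS


class SourceSummary:
    """Approximate index: (triple position, bucket) -> {source: triple count}."""

    def __init__(self):
        self.index = defaultdict(lambda: defaultdict(int))
        self.totals = defaultdict(int)  # triples seen per source

    def add_triple(self, source, s, p, o):
        """Record that `source` contains the triple (s, p, o)."""
        self.totals[source] += 1
        for pos, term in enumerate((s, p, o)):
            self.index[(pos, bucket(term))][source] += 1

    def candidates(self, pattern):
        """Sources that may match one triple pattern (None marks a variable).

        Hash collisions can yield false positives but never false negatives.
        The score is an upper bound on the matching triples in each source.
        """
        scores = dict(self.totals)
        for pos, term in enumerate(pattern):
            if term is None:
                continue
            hits = self.index.get((pos, bucket(term)), {})
            scores = {src: min(n, hits[src])
                      for src, n in scores.items() if src in hits}
        return scores

    def rank_sources(self, patterns):
        """Rank sources for a conjunctive query given as a list of patterns."""
        combined = defaultdict(int)
        for pattern in patterns:
            for src, n in self.candidates(pattern).items():
                combined[src] += n
        # Highest estimated contribution first; ties broken by source name.
        return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
```

A query processor built on such a summary would dereference only the ranked sources and evaluate the query over the freshly fetched data, which is how results stay current without a central copy of the data. Because the summary is lossy, some fetched sources may turn out to contribute nothing.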