As Semantic Web efforts continue to gather steam, RDF engines are faced with graphs containing millions of nodes and billions of edges. While much recent work addressing the resulting scalability issues in query processing over these datasets has mainly considered SPARQL 1.0, the next-generation query language recommendations propose adding regular-expression-restricted navigation queries to SPARQL. We address the problem of supporting efficient processing of property paths in RDF-3X, a high-performance RDF engine. In this paper, we limit our attention to a restricted definition of property paths that is not only tractable but also the most commonly used: instead of enumerating all paths that satisfy the given query, we focus on regular-expression-based reachability queries. Based on this, we make three major technical contributions: first, we present a detailed account of integrating the recently proposed, highly compact reachability index called FERRARI into the RDF-3X engine to support property path evaluation; second, we show how property path queries can be efficiently answered using multiple instances of this index, one instance for each distinct label in the graph; and finally, we develop a set of queries over real-world RDF data that can serve as a benchmark set for evaluating the efficiency of property path queries. Our experimental results over Yago2, a large RDF-based knowledge base, show that our proposed approach is highly scalable and flexible.
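To make the idea of label-restricted reachability concrete, the following is a minimal sketch, not the implementation described in the paper. It keeps one adjacency structure per edge label and answers whether a target can be reached from a source using only edges of that label. Plain breadth-first search stands in for the reachability index here; it is not FERRARI and has none of its compactness or speed. The class name `LabelReachability`, the `allow_zero` parameter, and the sample triples are all hypothetical and chosen for illustration only.

```python
from collections import defaultdict, deque


class LabelReachability:
    """Toy store with one adjacency map per edge label (hypothetical name).

    BFS is used in place of a real reachability index; this is NOT FERRARI.
    """

    def __init__(self, triples):
        # adj[label][subject] -> set of objects connected by that label
        self.adj = defaultdict(lambda: defaultdict(set))
        for s, p, o in triples:
            self.adj[p][s].add(o)

    def reachable(self, label, source, target, allow_zero=False):
        """Return True if target is reachable from source via `label` edges only.

        allow_zero=False models a path of one or more edges (label+);
        allow_zero=True also accepts the empty path (label*).
        """
        if allow_zero and source == target:
            return True
        edges = self.adj.get(label, {})
        seen = set()
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt == target:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False


# Illustrative data only; these triples are not taken from Yago2.
triples = [
    ("Berlin", "locatedIn", "Germany"),
    ("Germany", "locatedIn", "Europe"),
    ("Berlin", "type", "City"),
]
idx = LabelReachability(triples)
print(idx.reachable("locatedIn", "Berlin", "Europe"))
print(idx.reachable("type", "Berlin", "Europe"))
```

Partitioning the edges by label means each query touches only the subgraph for the label it names, which is the intuition behind keeping one index instance per distinct label.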