RDF-4G: algorithmic building blocks for large-scale graph analytics

  • Authors:
  • Stephan Seufert

  • Affiliations:
  • Max Planck Institute for Informatics, Saarbrücken, Germany

  • Venue:
  • Proceedings of the 2013 SIGMOD/PODS Ph.D. Symposium
  • Year:
  • 2013

Abstract

We present RDF-4G, the first three miles towards a large-scale graph-analytics engine built on top of the state-of-the-art RDF engine, RDF-3X. The algorithmic building blocks that make up this work help answer fundamental questions about relationships between entities in a graph-structured world. More precisely, our system provides insights into what we define as the trilogy of relationship analysis: Is there a relationship between entities? Who participates in the connection? How can the relationship be characterized? The first two questions correspond to two algorithmic primitives of graph processing, reachability and shortest-path queries. To answer the third question, we propose a novel graph-theoretic concept, relatedness cores. Our technical contributions are efficient index structures for reachability and shortest-path query processing, together with a new notion of, and algorithms for, relationship characterization. The latter can be computed efficiently using the techniques developed in our work on graph indexing. All our methods are integrated into the RDF-3X engine for querying RDF-structured data. Future work includes exposing our algorithmic building blocks to the user via extensions to SPARQL, the de facto standard query language for graph-structured data.
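
To make the first two questions of the trilogy concrete, the following is a minimal sketch of reachability and shortest-path queries on a small directed graph, using plain breadth-first search. It is not the index-based approach of RDF-4G or RDF-3X, whose details the abstract does not give. The adjacency-dictionary representation, the function names `shortest_path` and `is_reachable`, and the example entities are all assumptions made for illustration.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Return one shortest path (fewest edges) from source to target, or None.

    `graph` is assumed to be a dict mapping each node to a list of its
    successors; this representation is an illustrative choice only.
    """
    if source == target:
        return [source]
    parent = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                if neighbor == target:
                    # Walk the parent pointers back to the source.
                    path = [neighbor]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None


def is_reachable(graph, source, target):
    """Answer "is there a relationship?" as a plain reachability test."""
    return shortest_path(graph, source, target) is not None


# Hypothetical entity graph, invented for this example.
example_graph = {
    "dave": ["alice"],
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": [],
}

print(is_reachable(example_graph, "dave", "carol"))
print(shortest_path(example_graph, "dave", "carol"))
```

Each such query traverses the graph from scratch, which takes time linear in the graph size in the worst case. Precomputed index structures like those the abstract describes are meant to avoid repeating that traversal on large graphs.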