A linear algebra technique for (de)centralized processing of SPARQL queries

  • Authors:
  • Roberto De Virgilio

  • Affiliations:
  • Dipartimento di Informatica e Automazione, Università Roma Tre, Rome, Italy

  • Venue:
  • ER'12 Proceedings of the 31st international conference on Conceptual Modeling
  • Year:
  • 2012

Abstract

We are witnessing the evolution of the Web from a worldwide information space of linked documents to a global knowledge base composed of semantically interconnected resources (to date, 25 billion RDF triples, interlinked by around 395 million RDF links). RDF comes equipped with the SPARQL language for querying data in RDF format. Although many of the challenges of large-scale RDF data management have already been studied in the database research community, current approaches provide centralized, hard-coded solutions with high resource consumption; moreover, they exhibit very limited flexibility in dealing with queries at various levels of granularity and complexity (e.g. so-called non-conjunctive queries that use SPARQL's UNION or OPTIONAL). In this paper we propose a general model for answering SPARQL queries based on the first principles of linear algebra, in particular on tensor calculus. Leveraging our abstract algebraic framework, our technique allows both quick decentralized processing and centralized massive analysis. Experimental results show that our approach, which uses recent linear algebra techniques (tailored to the performance and accuracy required in applied mathematics and physics), performs analysis efficiently compared with competing approaches.
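The abstract does not spell out the model itself, so the following is only an illustrative sketch of one common way to view RDF data algebraically, not the paper's actual algorithm. It assumes an RDF graph is stored as a sparse rank-3 Boolean tensor indexed by (subject, predicate, object), that a triple pattern acts as a selection on the bound axes, and that a UNION corresponds to the Boolean sum of two result tensors. The data and all names (`TRIPLES`, `build_tensor`, `match`, `decode`) are hypothetical.

```python
# Hypothetical toy data; these triples are illustrative only.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
]


def build_tensor(triples):
    """Map terms to integer indices and return (index, terms, tensor).

    The tensor is kept sparse: a set holding the coordinates of its
    nonzero (True) entries, one (s, p, o) coordinate per triple.
    """
    terms = sorted({term for triple in triples for term in triple})
    index = {term: i for i, term in enumerate(terms)}
    tensor = {(index[s], index[p], index[o]) for s, p, o in triples}
    return index, terms, tensor


def match(tensor, index, s=None, p=None, o=None):
    """Apply a triple pattern to the tensor.

    A bound position behaves like a Kronecker delta on that axis;
    None marks a free variable. Unknown terms map to -1 and match nothing.
    """
    bound = tuple(None if x is None else index.get(x, -1) for x in (s, p, o))
    return {
        coord
        for coord in tensor
        if all(b is None or b == v for b, v in zip(bound, coord))
    }


def decode(coords, terms):
    """Translate integer coordinates back into sorted term triples."""
    return sorted(tuple(terms[i] for i in coord) for coord in coords)


index, terms, tensor = build_tensor(TRIPLES)

# Pattern { ?x knows ?y } UNION { ?x worksAt ?y } as a Boolean sum (set union).
union_result = match(tensor, index, p="knows") | match(tensor, index, p="worksAt")
print(decode(union_result, terms))
```

Because each pattern is evaluated independently against the nonzero entries, partial results of this kind could in principle be computed on separate partitions of the data and combined afterwards, which is the sort of decentralized processing the abstract alludes to; how the paper actually achieves this is not stated here.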