An efficient SQL-based RDF querying scheme

  • Authors: Eugene Inseok Chong; Souripriya Das; George Eadon; Jagannathan Srinivasan

  • Affiliations: Oracle, One Oracle Drive, Nashua, NH (all authors)

  • Venue: VLDB '05: Proceedings of the 31st International Conference on Very Large Data Bases
  • Year: 2005

Abstract

Devising a scheme for efficient and scalable querying of Resource Description Framework (RDF) data has been an active area of research. However, most approaches define new languages for querying RDF data, which has two shortcomings: 1) the new languages are difficult to integrate with the SQL queries used in database applications, and 2) they incur inefficiency because data has to be transformed from SQL to the corresponding language's data format. This paper proposes a SQL-based scheme that avoids these problems. Specifically, it introduces a SQL table function, RDF_MATCH, to query RDF data. The results of the RDF_MATCH table function can be further processed by SQL's rich querying capabilities and seamlessly combined with queries on traditional relational data. Furthermore, the RDF_MATCH table function invocation is rewritten as a SQL query, thereby avoiding run-time table function procedural overheads. The rewrite also enables optimization of the rewritten query in conjunction with the rest of the query. The resulting query is executed efficiently by making use of B-tree indexes as well as specialized subject-property materialized views. This paper describes the functionality of the RDF_MATCH table function for querying RDF data, which can optionally include user-defined rulebases, and discusses its implementation in the Oracle RDBMS. It also presents an experimental study characterizing the overhead eliminated by avoiding procedural code at runtime, characterizing performance under various input conditions, and demonstrating scalability using 80 million RDF triples from UniProt protein and annotation data.
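
The idea of rewriting a graph-pattern match as a plain SQL query can be illustrated with a minimal sketch. It is not the paper's implementation and does not reproduce the RDF_MATCH signature: the `triples(s, p, o)` table, the `pattern_to_sql` helper, and the sample data are all hypothetical, and SQLite stands in for the Oracle RDBMS. Each triple pattern becomes one alias of the triples table, constants become filters, and a variable shared between patterns becomes a join condition.

```python
import sqlite3


def pattern_to_sql(patterns):
    """Rewrite (subject, property, object) triple patterns into one self-join
    query over a hypothetical triples(s, p, o) table.

    Terms starting with '?' are variables; everything else is a constant.
    Variable names are assumed to be valid SQL identifiers.
    """
    select, where, params = [], [], []
    first_seen = {}  # variable -> column reference where it first appeared
    for i, triple in enumerate(patterns):
        for col, term in zip(("s", "p", "o"), triple):
            ref = f"t{i}.{col}"
            if term.startswith("?"):
                if term in first_seen:
                    # Repeated variable: becomes a join condition.
                    where.append(f"{ref} = {first_seen[term]}")
                else:
                    first_seen[term] = ref
                    select.append(f"{ref} AS {term[1:]}")
            else:
                # Constant: becomes a filter, passed as a bound parameter.
                where.append(f"{ref} = ?")
                params.append(term)
    tables = ", ".join(f"triples t{i}" for i in range(len(patterns)))
    sql = f"SELECT {', '.join(select) or '1'} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
# An ordinary index on (property, object, subject) supports the filters.
conn.execute("CREATE INDEX triples_pos ON triples (p, o, s)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("john", "rdf:type", "Student"),
        ("mary", "rdf:type", "Student"),
        ("john", "age", "24"),
        ("mary", "age", "21"),
        ("pat", "age", "30"),
    ],
)

# Pattern: (?x rdf:type Student) (?x age ?y)
sql, params = pattern_to_sql(
    [("?x", "rdf:type", "Student"), ("?x", "age", "?y")]
)
rows = sorted(conn.execute(sql, params).fetchall())
print(sql)
print(rows)
```

Because the output is ordinary SQL, it can be embedded as a subquery and joined with relational tables, and the database optimizer sees the whole statement at once. This is the property the abstract attributes to the rewrite, shown here only in miniature.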