RDF Data-Centric Storage

Authors:
Justin J. Levandoski;Mohamed F. Mokbel
Affiliations:
-;-
Venue:
ICWS '09 Proceedings of the 2009 IEEE International Conference on Web Services
Year:
2009

Abstract

The vision of the Semantic Web has brought about new challenges at the intersection of web research and data management. One fundamental research issue at this intersection is the storage of Resource Description Framework (RDF) data, the model at the core of the Semantic Web. We present a data-centric approach for storing RDF in relational databases. The intuition behind our approach is that each RDF dataset requires a tailored table schema that achieves efficient query processing by (1) reducing the need for joins in the query plan and (2) keeping null storage below a given threshold. Using a basic structure derived from the RDF data, we propose a two-phase algorithm involving clustering and partitioning. The clustering phase aims to reduce the need for joins in a query. The partitioning phase aims to optimize the storage of extra (i.e., null) data in the underlying relational database. Our approach does not assume a particular query workload, which makes it relevant for RDF knowledge bases that serve a large number of ad-hoc queries. Extensive experimental evidence using three publicly available real-world RDF data sets (DBLP, DBpedia, and UniProt) shows that our schema creation technique provides superior query processing performance compared to state-of-the-art storage approaches. Further, our approach is easily implemented and complements existing RDF-specific databases.
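
The trade-off described in the abstract (wider tables mean fewer joins but more null cells) can be illustrated with a small sketch. This is not the authors' two-phase algorithm: it is a simplified greedy grouping, written only to show how a null-storage threshold limits which properties can share a table. The function names, the sample triples, and the `max_null` value of 0.3 are all assumptions made for illustration.

```python
from collections import defaultdict


def subject_properties(triples):
    """Map each subject to the set of properties it uses."""
    props = defaultdict(set)
    for subject, prop, _ in triples:
        props[subject].add(prop)
    return props


def null_fraction(columns, subj_props):
    """Fraction of empty cells in a table with the given property columns.

    A subject gets a row if it uses at least one of the columns.
    """
    columns = set(columns)
    rows = [ps for ps in subj_props.values() if ps & columns]
    if not rows:
        return 0.0
    filled = sum(len(ps & columns) for ps in rows)
    return 1.0 - filled / (len(rows) * len(columns))


def greedy_tables(triples, max_null=0.3):
    """Group properties into tables, keeping each table's null fraction
    at or below max_null (hypothetical greedy stand-in, not the paper's method)."""
    subj_props = subject_properties(triples)
    counts = defaultdict(int)
    for ps in subj_props.values():
        for p in ps:
            counts[p] += 1
    # Most frequent properties first; ties broken by name for determinism.
    ordered = sorted(counts, key=lambda p: (-counts[p], p))
    tables = []
    for p in ordered:
        for table in tables:
            if null_fraction(table + [p], subj_props) <= max_null:
                table.append(p)
                break
        else:
            tables.append([p])
    return tables


# Made-up sample data: three article-like subjects and two person-like subjects.
triples = [
    ("a1", "title", "T1"), ("a1", "author", "p1"), ("a1", "year", "2009"),
    ("a2", "title", "T2"), ("a2", "author", "p2"), ("a2", "year", "2008"),
    ("a3", "title", "T3"), ("a3", "author", "p1"),
    ("p1", "name", "N1"), ("p1", "homepage", "H1"),
    ("p2", "name", "N2"),
]

for table in greedy_tables(triples):
    print(table)
```

With this sample data, properties that tend to occur on the same subjects end up in the same table, so a query touching several of them needs no join. Lowering `max_null` splits the tables further, trading joins for less null storage.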