SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Bigtable: a distributed storage system for structured data
OSDI '06 Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7
RDF-3X: a RISC-style engine for RDF
Proceedings of the VLDB Endowment
Hexastore: sextuple indexing for semantic web data management
Proceedings of the VLDB Endowment
SPARQL basic graph pattern processing with iterative MapReduce
Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud
Programming Support Innovations for Emerging Distributed Applications
YCSB++: benchmarking and performance debugging advanced features in scalable table stores
Proceedings of the 2nd ACM Symposium on Cloud Computing
Social infobuttons: integrating open health data with social data using semantic technology
Proceedings of the Fifth Workshop on Semantic Web Information Management
Bloofi: a hierarchical Bloom filter index with applications to distributed data provenance
Proceedings of the 2nd International Workshop on Cloud Intelligence
Hi-index | 0.00 |
Resource Description Framework (RDF) was designed with the initial goal of developing metadata for the Internet. While the Internet is a conglomeration of many interconnected networks and computers, most of today's best RDF storage solutions are confined to a single node. Working on a single node has significant scalability issues, especially considering the magnitude of modern day data. In this paper we introduce a scalable RDF data management system that uses Accumulo, a Google Bigtable variant. We introduce storage methods, indexing schemes, and query processing techniques that scale to billions of triples across multiple nodes, while providing fast and easy access to the data through conventional query mechanisms such as SPARQL. Our performance evaluation shows that in most cases, our system outperforms existing distributed RDF solutions, even systems much more complex than ours.