Partitioned indexes for entity search over RDF knowledge bases
DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
Processing SPARQL queries on a single node does not scale with the rapid growth of RDF knowledge bases, which calls for scalable SPARQL query processing over Web-scale RDF data. There have been attempts to apply SPARQL query processing techniques in MapReduce environments. However, no study has examined optimal partitioning and indexing schemes for distributing RDF data in MapReduce. In this paper, we investigate an RDF data partitioning technique that provides effective indexing schemes to support efficient SPARQL query processing in MapReduce. Our extensive experiments over a very large real-life RDF dataset demonstrate the performance of the proposed partitioning and indexing schemes.