Semantic inferencing and querying across large-scale RDF triple stores are notoriously slow. Our objective is to expedite this process by using Google's MapReduce framework to implement scale-out distributed querying and reasoning. This approach requires RDF graphs to be decomposed into smaller units that are distributed across computational nodes. RDF molecules appear to offer an ideal approach, providing an intermediate level of granularity between RDF graphs and triples. However, the original RDF molecule definition has inherent limitations that adversely affect performance. In this paper, we propose extensions to RDF molecules (hierarchy and ordering) to overcome these limitations. We then present implementation details of our MapReduce-based RDF molecule store. Finally, we evaluate the benefits of our approach in the context of the Bio-MANTA project, an application that requires integration and querying across large-scale protein-protein interaction datasets.
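To make the decomposition step concrete, the following is a minimal sketch of one way an RDF graph could be split into molecule-like units and spread across nodes. It is not the store described in this paper. It assumes the commonly used reading of a molecule as a group of triples connected through shared blank nodes, with each fully ground triple forming its own unit, and it does not model the hierarchy and ordering extensions. The `_:` blank-node prefix, the function names `decompose` and `partition`, the `ex:` sample data, and the CRC32-based placement are all illustrative assumptions.

```python
import zlib
from collections import defaultdict


def is_blank(term):
    # Assumption: blank nodes carry the "_:" prefix, as in N-Triples.
    return term.startswith("_:")


def decompose(triples):
    """Group triples into molecule-like units.

    Triples that share a blank node (directly or transitively) end up in
    the same unit; a triple with no blank node forms a unit on its own.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link blank nodes that occur together in one triple.
    for s, _, o in triples:
        blanks = [t for t in (s, o) if is_blank(t)]
        if len(blanks) == 2:
            parent[find(blanks[0])] = find(blanks[1])

    molecules = []
    groups = defaultdict(list)
    for triple in triples:
        s, _, o = triple
        blanks = [t for t in (s, o) if is_blank(t)]
        if blanks:
            groups[find(blanks[0])].append(triple)
        else:
            molecules.append([triple])
    molecules.extend(groups.values())
    return molecules


def partition(molecules, num_nodes):
    """Assign whole molecules to nodes with a deterministic hash."""
    buckets = defaultdict(list)
    for molecule in molecules:
        key = " ".join(min(molecule)).encode("utf-8")
        buckets[zlib.crc32(key) % num_nodes].append(molecule)
    return dict(buckets)


# Hypothetical protein-interaction triples, for illustration only.
triples = [
    ("ex:P1", "ex:interactsWith", "ex:P2"),
    ("ex:P1", "ex:hasXref", "_:b1"),
    ("_:b1", "ex:db", "UniProt"),
    ("_:b1", "ex:id", "Q12345"),
]
molecules = decompose(triples)
placement = partition(molecules, num_nodes=4)
```

Because a molecule is never split, every triple that mentions a given blank node lands on the same node, so the blank node never has to be resolved across machines.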