Distributed Semantic Web Data Management in HBase and MySQL Cluster

Authors:
Craig Franke; Samuel Morin; Artem Chebotko; John Abraham; Pearl Brazier
Venue:
CLOUD '11 Proceedings of the 2011 IEEE 4th International Conference on Cloud Computing
Year:
2011

Citing 0
Cited 4

Provenance for MapReduce-based data-intensive workflows
Proceedings of the 6th workshop on Workflows in support of large-scale science

Scalable RDF graph querying using cloud computing
Journal of Web Engineering

Efficient data partitioning model for heterogeneous graphs in the cloud
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

Ultrawrap: SPARQL execution on relational data
Web Semantics: Science, Services and Agents on the World Wide Web

Abstract

Various computing and data resources on the Web are being enhanced with machine-interpretable semantic descriptions to facilitate better search, discovery, and integration. This interconnected metadata constitutes the Semantic Web, whose volume can potentially grow to the scale of the Web itself. Efficient management of Semantic Web data, expressed using the W3C's Resource Description Framework (RDF), is crucial for supporting new data-intensive, semantics-enabled applications. In this work, we study and compare two approaches to distributed RDF data management: one based on emerging cloud computing technologies and one based on traditional relational database clustering technologies. In particular, we design distributed RDF data storage and querying schemes for HBase and MySQL Cluster and conduct an empirical comparison of these approaches on a cluster of commodity machines, using datasets and queries from the Third Provenance Challenge and the Lehigh University Benchmark. Our study reveals interesting patterns in query evaluation, shows that our algorithms are promising, and suggests that cloud computing has great potential for scalable Semantic Web data management.