RDF data management in the Amazon cloud

Authors:
Francesca Bugiotti;François Goasdoué;Zoi Kaoudi;Ioana Manolescu
Affiliations:
Università Roma Tre, Italy;Université Paris-Sud and Inria Saclay, France;Inria Saclay and Université Paris-Sud, France;Inria Saclay and Université Paris-Sud, France
Venue:
Proceedings of the 2012 Joint EDBT/ICDT Workshops
Year:
2012

Citing 13
Cited 3

MapReduce: simplified data processing on large clusters

OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Building a database on S3

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Web Semantics in the Clouds

IEEE Intelligent Systems
SP^2Bench: A SPARQL Performance Benchmark

ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Scalable join processing on very large RDF graphs

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
The RDF-3X engine for scalable management of RDF data

The VLDB Journal — The International Journal on Very Large Data Bases
SPARQL basic graph pattern processing with iterative MapReduce

Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud
Data Intensive Query Processing for Large RDF Graphs Using Cloud Computing Tools

CLOUD '10 Proceedings of the 2010 IEEE 3rd International Conference on Cloud Computing
Runtime measurements in the cloud: observing, analyzing, and reducing variance

Proceedings of the VLDB Endowment
Workshop on semantic data management: a summary report

ACM SIGMOD Record
Predicting cost amortization for query services

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
PigSPARQL: mapping SPARQL to Pig Latin

Proceedings of the International Workshop on Semantic Web Information Management
On the Performance Variability of Production Cloud Services

CCGRID '11 Proceedings of the 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

DataBridges: data integration for digital cities

Proceedings of the 2012 ACM workshop on City data management workshop
AMADA: web data repositories in the amazon cloud

Proceedings of the 21st ACM international conference on Information and knowledge management
Linked open GeoData management in the cloud

Proceedings of the 2nd International Workshop on Open Data

Abstract

Cloud computing has recently been widely adopted across many applications for its elastic scaling and fault tolerance. At the same time, the number of RDF data sources available on the Web is growing rapidly, creating a constant need for scalable RDF data management tools. In this paper we propose a novel architecture for the distributed management of RDF data that exploits an existing commercial cloud infrastructure, namely Amazon Web Services (AWS). We study the problem of indexing RDF data stored within AWS using SimpleDB, a key-value store provided by AWS for small data items. The goal of the index is to efficiently identify the RDF datasets that may contain answers to a given query, and to route the query only to those datasets. We devised and experimented with several indexing strategies; we discuss experimental results and avenues for future work.
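
The routing idea in the abstract can be illustrated with a minimal sketch. It is not one of the paper's indexing strategies, which the abstract does not detail. It assumes a simple term-to-dataset index, uses an in-memory dictionary as a stand-in for a key-value store such as SimpleDB, and assumes each dataset is queried independently, so a candidate dataset must contain every constant of the query. The names `DatasetIndex`, `add_dataset`, and `candidates` are hypothetical.

```python
from collections import defaultdict


class DatasetIndex:
    """Toy term-to-dataset index (hypothetical; a dict stands in for the key-value store)."""

    def __init__(self):
        self._store = defaultdict(set)  # key: RDF term, value: ids of datasets containing it
        self._datasets = set()          # all dataset ids seen so far

    def add_dataset(self, dataset_id, triples):
        """Index every subject, property, and object of the dataset's triples."""
        self._datasets.add(dataset_id)
        for s, p, o in triples:
            for term in (s, p, o):
                self._store[term].add(dataset_id)

    def candidates(self, patterns):
        """Return the datasets that contain all constants of the triple patterns.

        Terms starting with '?' are variables and are ignored. The result may
        include false positives: a dataset can hold every constant and still
        have no answer, which is why it only "may" contain answers.
        """
        constants = {t for pattern in patterns for t in pattern if not t.startswith("?")}
        result = set(self._datasets)
        for term in constants:
            result &= self._store.get(term, set())
        return result


index = DatasetIndex()
index.add_dataset("ds1", [(":alice", "foaf:knows", ":bob"),
                          (":alice", "foaf:name", '"Alice"')])
index.add_dataset("ds2", [(":carol", "foaf:knows", ":dave")])

# Only ds1 mentions both foaf:knows and :bob, so the query is routed to ds1 alone.
routed = index.candidates([("?x", "foaf:knows", ":bob")])
```

A query whose only constant is `foaf:knows` would be routed to both datasets, and a query mentioning an unindexed term would be routed to none.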