Cloud computing has recently been widely adopted across many applications for its elastic scaling and fault tolerance. At the same time, the number of RDF data sources available on the Web is growing rapidly, creating a constant need for scalable RDF data management tools. In this paper we propose a novel architecture for the distributed management of RDF data that exploits an existing commercial cloud infrastructure, namely Amazon Web Services (AWS). We study the problem of indexing RDF data stored within AWS using SimpleDB, a key-value store provided by AWS for small data items. The goal of the index is to efficiently identify the RDF datasets that may contain answers to a given query, and to route the query only to those datasets. We devised and experimented with several indexing strategies; we discuss experimental results and avenues for future work.
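
To make the routing idea concrete, here is a minimal sketch of a term-to-dataset index. It is not one of the indexing strategies evaluated in the paper, and it does not use the SimpleDB API: a plain Python dictionary stands in for the key-value store. The names `DatasetIndex`, `add_dataset`, `candidates`, and `route` are hypothetical, as are the sample triples. The sketch also assumes that each dataset is queried independently, so a dataset is kept only if it may match every triple pattern of the query.

```python
from collections import defaultdict


class DatasetIndex:
    """Toy in-memory stand-in for a key-value index (not the SimpleDB API)."""

    def __init__(self):
        # key: an RDF term (subject, predicate, or object)
        # value: identifiers of the datasets in which the term occurs
        self._entries = defaultdict(set)

    def add_dataset(self, dataset_id, triples):
        """Index every term of every (subject, predicate, object) triple."""
        for triple in triples:
            for term in triple:
                self._entries[term].add(dataset_id)

    def candidates(self, pattern, all_datasets):
        """Datasets that may match one triple pattern.

        Terms starting with '?' are variables and do not constrain the result.
        """
        result = set(all_datasets)
        for term in pattern:
            if not term.startswith("?"):
                result &= self._entries.get(term, set())
        return result


def route(index, patterns, all_datasets):
    """Datasets that may answer the whole query.

    Assumption: each dataset is queried on its own, so a candidate
    must pass the index lookup for every triple pattern.
    """
    result = set(all_datasets)
    for pattern in patterns:
        result &= index.candidates(pattern, all_datasets)
    return result


# Hypothetical sample data, for illustration only.
index = DatasetIndex()
index.add_dataset("ds1", [("ex:alice", "foaf:knows", "ex:bob")])
index.add_dataset("ds2", [("ex:carol", "foaf:name", "Carol")])
datasets = {"ds1", "ds2"}

targets = route(index, [("?x", "foaf:knows", "?y")], datasets)
```

Such an index can return false positives, since a dataset may contain every constant of a pattern without containing a matching triple, but it never discards a dataset that holds an answer. This matches the abstract's phrasing that the index identifies datasets which "may" have answers.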