In this paper, we investigate the scalable processing of complex SPARQL queries on very large RDF datasets. As the underlying platform, we use Apache Hadoop, an open-source implementation of Google's MapReduce for massively parallel computation on a computer cluster. We introduce PigSPARQL, a system that processes complex SPARQL queries on a MapReduce cluster. To this end, SPARQL queries are translated into Pig Latin, a data analysis language developed by Yahoo! Research. Pig Latin programs are executed as a series of MapReduce jobs on a Hadoop cluster. We evaluate PigSPARQL using SP2Bench, a SPARQL-specific performance benchmark, and demonstrate that it enables scalable execution of SPARQL queries on Hadoop without any additional programming effort.
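To make the translation step more concrete, the following is a minimal Python sketch of how a SPARQL basic graph pattern could be turned into a Pig Latin script. It is an illustration only, not PigSPARQL's actual translation. It assumes the RDF data is stored as tab-separated (subject, predicate, object) lines, that every triple pattern contains at least one variable, and that each pattern shares a variable with the patterns before it. The function names (`pattern_to_pig`, `bgp_to_pig`), the relation aliases, and the input path are hypothetical.

```python
def is_var(term):
    """SPARQL variables are written with a leading '?'."""
    return term.startswith("?")


def pattern_to_pig(index, pattern):
    """Translate one triple pattern into FILTER/FOREACH statements.

    Returns (pig_lines, alias, variables). Constants become filter
    conditions; variables become named output columns.
    """
    conditions, projections, seen = [], [], {}
    for column, term in zip(("s", "p", "o"), pattern):
        if is_var(term):
            name = term[1:]
            if name in seen:
                # Same variable twice in one pattern: columns must be equal.
                conditions.append(f"{seen[name]} == {column}")
            else:
                seen[name] = column
                projections.append(f"{column} AS {name}")
        else:
            conditions.append(f"{column} == '{term}'")

    lines, source, alias = [], "triples", f"t{index}"
    if conditions:
        source = f"f{index}"
        lines.append(f"{source} = FILTER triples BY {' AND '.join(conditions)};")
    lines.append(f"{alias} = FOREACH {source} GENERATE {', '.join(projections)};")
    return lines, alias, list(seen)


def bgp_to_pig(patterns, input_path):
    """Translate a list of triple patterns into a Pig Latin script (sketch)."""
    lines = [
        f"triples = LOAD '{input_path}' USING PigStorage('\\t') "
        "AS (s:chararray, p:chararray, o:chararray);"
    ]
    result_alias, result_vars = None, []
    for i, pattern in enumerate(patterns):
        pat_lines, alias, pat_vars = pattern_to_pig(i, pattern)
        lines.extend(pat_lines)
        if result_alias is None:
            result_alias, result_vars = alias, pat_vars
            continue
        shared = [v for v in result_vars if v in pat_vars]
        if not shared:
            raise ValueError("cross products are not handled in this sketch")
        keys = ", ".join(shared)
        lines.append(f"j{i} = JOIN {result_alias} BY ({keys}), {alias} BY ({keys});")
        # After a join, Pig disambiguates columns as alias::column.
        new_vars = [v for v in pat_vars if v not in result_vars]
        fields = [f"{result_alias}::{v} AS {v}" for v in result_vars]
        fields += [f"{alias}::{v} AS {v}" for v in new_vars]
        lines.append(f"r{i} = FOREACH j{i} GENERATE {', '.join(fields)};")
        result_alias, result_vars = f"r{i}", result_vars + new_vars
    lines.append(f"DUMP {result_alias};")
    return "\n".join(lines)


if __name__ == "__main__":
    # SELECT ?article ?title WHERE { ?article rdf:type bench:Article .
    #                                ?article dc:title ?title }
    example = [
        ("?article", "rdf:type", "bench:Article"),
        ("?article", "dc:title", "?title"),
    ]
    print(bgp_to_pig(example, "rdf/triples.tsv"))
```

Each triple pattern becomes a selection over the triples relation, and patterns that share a variable are combined with a JOIN on that variable. When Pig executes the generated script on Hadoop, it compiles these statements into one or more MapReduce jobs.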