There have been a number of approaches to adopting the RDF data model and the MapReduce framework for data warehousing, since the data model is well suited to data integration and the processing framework supports large-scale, fault-tolerant data analysis. Nevertheless, most approaches consider the data model and the framework separately; it has been difficult to create synergy because only a few algorithms connect the two. In this paper, we present a general and efficient MapReduce algorithm for the SPARQL Basic Graph Pattern, which is a set of triple patterns to be joined. In the MapReduce setting, the join operation is known to require computationally expensive MapReduce iterations. For this reason, we minimize the number of iterations in two ways. First, we adapt the traditional multi-way join to MapReduce instead of executing multiple individual joins. Second, by analyzing a given query, we select a good join key so as to avoid unnecessary iterations. As a result, the algorithm shows good performance and scalability in terms of both time and data size.
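To illustrate the idea, the following is a minimal, single-machine sketch of a multi-way join over a Basic Graph Pattern, expressed as one map phase and one reduce phase. It is not the authors' implementation; the triple data, the example BGP, and the choice of `?s` as the join key are all hypothetical. Each mapper output is keyed on the shared join variable, so all triple patterns are joined in a single MapReduce round instead of one round per pairwise join.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy triple store (subject, predicate, object).
triples = [
    ("alice", "type", "Student"),
    ("alice", "takesCourse", "db"),
    ("alice", "memberOf", "cs"),
    ("bob", "type", "Student"),
    ("bob", "takesCourse", "ai"),
]

# Example BGP: three triple patterns sharing the join variable ?s.
patterns = [
    ("?s", "type", "Student"),
    ("?s", "takesCourse", "?c"),
    ("?s", "memberOf", "?d"),
]

def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    bindings = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            bindings[p] = t
        elif p != t:
            return None
    return bindings

def map_phase(triples, patterns, join_var):
    """Emit (join-key, (pattern-id, bindings)) for every matching triple."""
    for triple in triples:
        for i, pat in enumerate(patterns):
            b = match(pat, triple)
            if b is not None:
                yield b[join_var], (i, b)

def reduce_phase(pairs, n_patterns):
    """Group by join key; combine bindings only when every pattern matched."""
    groups = defaultdict(lambda: defaultdict(list))
    for key, (i, b) in pairs:
        groups[key][i].append(b)
    results = []
    for by_pattern in groups.values():
        if len(by_pattern) < n_patterns:  # some pattern unmatched: no join result
            continue
        # Cross-product of the bindings contributed by each pattern.
        for combo in product(*(by_pattern[i] for i in range(n_patterns))):
            merged = {}
            for b in combo:
                merged.update(b)
            results.append(merged)
    return results

pairs = map_phase(triples, patterns, "?s")
print(reduce_phase(pairs, len(patterns)))
# → [{'?s': 'alice', '?c': 'db', '?d': 'cs'}]
```

Here "alice" matches all three patterns and yields one solution, while "bob" lacks a `memberOf` triple and is filtered out in the reducer. On real Hadoop, the grouping in `reduce_phase` would be performed by the shuffle, so the entire three-pattern join completes in one iteration rather than two chained pairwise joins.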