In this work we present H2RDF, a fully distributed RDF store that combines the MapReduce processing framework with a NoSQL distributed data store. Our system features two unique characteristics that enable efficient processing of both simple and multi-join SPARQL queries over a virtually unlimited number of triples: join algorithms that execute joins according to query selectivity to reduce processing, and an adaptive choice between centralized and distributed (MapReduce-based) join execution for fast query responses. H2RDF efficiently answers both simple joins and complex multivariate queries, and it scales to 3 billion triples on a small cluster of 9 worker nodes. It outperforms state-of-the-art distributed solutions on multi-join and nonselective queries while achieving performance comparable to centralized solutions on selective queries. In this demonstration we showcase the system's functionality through an interactive GUI. Users will be able to execute predefined or custom-made SPARQL queries on datasets of different sizes, using different join algorithms. Moreover, they can repeat all queries with a different number of cluster resources. Using real-time cluster monitoring and detailed statistics, participants will be able to see how the different execution schemes behave on different input data, and how H2RDF scales with both the data size and the available worker resources.
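The two ideas named in the abstract, selectivity-driven join execution and an adaptive choice between centralized and MapReduce-based joins, can be illustrated with a minimal sketch. This is not H2RDF's actual planner: the `TriplePattern` type, the `estimated_matches` statistic, the `CENTRALIZED_LIMIT` threshold, and the example query are all hypothetical, chosen only to show the shape of such a decision.

```python
from dataclasses import dataclass

# Hypothetical threshold: at or below this many estimated input triples a join
# is run on a single node. The value is illustrative, not taken from H2RDF.
CENTRALIZED_LIMIT = 100_000


@dataclass(frozen=True)
class TriplePattern:
    text: str               # a SPARQL triple pattern, kept as plain text
    estimated_matches: int  # assumed to come from index statistics


def order_by_selectivity(patterns):
    """Return patterns with the most selective (fewest matches) first."""
    return sorted(patterns, key=lambda p: p.estimated_matches)


def choose_execution(patterns, limit=CENTRALIZED_LIMIT):
    """Pick 'centralized' for small estimated inputs, 'mapreduce' otherwise."""
    total = sum(p.estimated_matches for p in patterns)
    return "centralized" if total <= limit else "mapreduce"


# Hypothetical query with made-up cardinality estimates.
query = [
    TriplePattern("?s rdf:type ub:Student", 2_000_000),
    TriplePattern("?s ub:advisor <Prof1>", 40),
    TriplePattern("?s ub:takesCourse ?c", 5_000_000),
]

plan = order_by_selectivity(query)  # the advisor pattern is joined first
mode = choose_execution(query)      # large total input, so distributed
```

In this sketch a query touching only the highly selective advisor pattern would fall under the threshold and run centralized, while the full three-pattern query is routed to the distributed path. A real system would estimate the size of intermediate join results rather than simply summing the inputs.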