A distributed graph engine for web scale RDF data

Authors:
Kai Zeng; Jiacheng Yang; Haixun Wang; Bin Shao; Zhongyuan Wang
Affiliations:
UCLA; Columbia University; Microsoft Research Asia; Microsoft Research Asia; Microsoft Research Asia and Renmin University of China
Venue:
Proceedings of the VLDB Endowment
Year:
2013


Abstract

Much work has been devoted to supporting RDF data, but state-of-the-art systems and methods still cannot handle web-scale RDF data effectively. Furthermore, many useful, general-purpose graph-based operations (e.g., random walk, reachability, community discovery) on RDF data are not supported, as most existing systems store and index data in particular ways (e.g., as relational tables or as a bitmap matrix) to maximize the performance of one particular operation on RDF data: SPARQL query processing. In this paper, we introduce Trinity.RDF, a distributed, memory-based graph engine for web-scale RDF data. Instead of managing the RDF data in triple stores or as bitmap matrices, we store RDF data in its native graph form. Trinity.RDF achieves much better (sometimes orders of magnitude better) performance for SPARQL queries than the state-of-the-art approaches. Furthermore, since the data is stored in its native graph form, the system can support other operations (e.g., random walks, reachability) on RDF graphs as well. We conduct comprehensive experimental studies on real-life, web-scale RDF data to demonstrate the effectiveness of our approach.
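
To make the idea of "native graph form" concrete, the following is a minimal single-machine sketch, not Trinity.RDF's actual design or API. It assumes each RDF triple (subject, predicate, object) becomes a labeled edge kept in per-node adjacency lists. The class `TripleGraph`, its method names, and the sample triples are all hypothetical and exist only for illustration. The sketch shows how the same structure can answer a single SPARQL-style triple pattern and a reachability query.

```python
from collections import defaultdict, deque


class TripleGraph:
    """Toy in-memory RDF graph (illustrative only, not the paper's system)."""

    def __init__(self):
        # subject -> list of (predicate, object) pairs
        self.out_edges = defaultdict(list)
        # object -> list of (predicate, subject) pairs
        self.in_edges = defaultdict(list)

    def add(self, s, p, o):
        """Store one triple as an edge visible from both endpoints."""
        self.out_edges[s].append((p, o))
        self.in_edges[o].append((p, s))

    def objects(self, s, p):
        """Match the pattern (s, p, ?o) by scanning s's outgoing edges."""
        return [o for (q, o) in self.out_edges.get(s, []) if q == p]

    def subjects(self, p, o):
        """Match the pattern (?s, p, o) by scanning o's incoming edges."""
        return [s for (q, s) in self.in_edges.get(o, []) if q == p]

    def reachable(self, source, target):
        """Breadth-first search over outgoing edges, ignoring predicates."""
        seen = {source}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node == target:
                return True
            for _, nxt in self.out_edges.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False


# Hypothetical sample data.
g = TripleGraph()
g.add("alice", "advisor", "bob")
g.add("carol", "advisor", "bob")
g.add("bob", "worksFor", "lab")

print(g.objects("alice", "advisor"))   # pattern (alice, advisor, ?o)
print(g.subjects("advisor", "bob"))    # pattern (?s, advisor, bob)
print(g.reachable("alice", "lab"))     # graph operation on the same data
```

Keeping both outgoing and incoming adjacency lists lets a pattern be explored from whichever endpoint is bound, and the traversal reuses the same structure without a separate index. A distributed engine would additionally have to partition nodes across machines, which this sketch does not attempt.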