A performance evaluation of open source graph databases

  • Authors:
  • Robert Campbell McColl; David Ediger; Jason Poovey; Dan Campbell; David A. Bader

  • Affiliations:
  • Georgia Tech / GTRI, Atlanta, GA, USA; Georgia Tech Research Institute, Atlanta, GA, USA; Georgia Tech Research Institute, Atlanta, GA, USA; Georgia Tech Research Institute, Atlanta, GA, USA; Georgia Institute of Technology, Atlanta, GA, USA

  • Venue:
  • Proceedings of the First Workshop on Parallel Programming for Analytics Applications
  • Year:
  • 2014

Abstract

With the proliferation of large, irregular, and sparse relational datasets, new storage and analysis platforms have arisen to fill gaps in performance and capability left by conventional approaches built on traditional database technologies and query languages. Many of these platforms apply graph structures and analysis techniques to enable users to ingest, update, query, and compute on the topological structure of the network represented as sets of edges relating sets of vertices. To store and process Facebook-scale datasets, software and algorithms must be able to support data sources with billions of edges, update rates of millions of updates per second, and complex analysis kernels. These platforms must provide intuitive interfaces that enable graph experts and novice programmers to write implementations of common graph algorithms. In this paper, we conduct a qualitative study and a performance comparison of 12 open source graph databases using four fundamental graph algorithms on networks containing up to 256 million edges.