ParallelGDB: a parallel graph database based on cache specialization

Authors:
Luis Barguñó;Victor Muntés-Mulero;David Dominguez-Sal;Patrick Valduriez
Affiliations:
DAMA - UPC, Barcelona;DAMA - UPC, Barcelona;DAMA - UPC, Barcelona;INRIA - LIRMM, France
Venue:
Proceedings of the 15th Symposium on International Database Engineering & Applications
Year:
2011

Abstract

Managing massive attributed graphs is becoming a common need in areas such as recommendation systems, proteomics analysis, social network analysis, and bibliographic analysis. This makes it necessary to move towards parallel systems that can manage graph databases containing millions of vertices and edges. Previous work on distributed graph databases has focused on finding ways to partition the graph to reduce network traffic and improve execution time. However, partitioning a graph and keeping track of the location of every vertex might be unrealistic for massive graphs. In this paper, we propose ParallelGDB, a new system based on specializing the local cache of each node in the system, which provides a better cache hit ratio. ParallelGDB uses random graph partitioning, avoiding complex partitioning methods based on the graph topology, which usually require managing extra data structures. The proposed system provides an efficient environment for distributed graph databases.
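
The abstract does not say how the random partitioning is implemented. As a rough illustration only, the sketch below assumes hash-based placement, one common way to get a pseudo-random assignment in which any node can compute a vertex's owner without a location directory. The names owner_of and partition are hypothetical and do not come from the paper, and the sketch does not model the cache specialization itself.

```python
import hashlib


def owner_of(vertex_id: int, num_nodes: int) -> int:
    """Return the node that stores a vertex (hypothetical helper).

    Assumption: placement is derived from a stable hash of the vertex id,
    so no per-vertex location table has to be stored or looked up.
    """
    digest = hashlib.blake2b(str(vertex_id).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_nodes


def partition(vertex_ids, num_nodes: int) -> dict:
    """Group vertex ids by owning node, ignoring the graph topology."""
    parts = {node: [] for node in range(num_nodes)}
    for vertex_id in vertex_ids:
        parts[owner_of(vertex_id, num_nodes)].append(vertex_id)
    return parts


if __name__ == "__main__":
    # Show how many of 10,000 vertices land on each of 4 nodes.
    sizes = {node: len(vs) for node, vs in partition(range(10_000), 4).items()}
    print(sizes)
```

Because the assignment ignores edges, neighbouring vertices will often sit on different nodes. That is the trade-off the abstract describes: no topology-aware partitioner and no extra location structures, at the cost of more remote accesses that the system then has to absorb through its caches.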