Searching and mining large graphs is critical to a variety of application domains, ranging from community detection in social networks to de novo genome sequence assembly. Scalable processing of large graphs requires careful partitioning and distribution of graphs across clusters. In this paper, we investigate the problem of managing large-scale graphs in clusters and study the access characteristics of local graph queries such as breadth-first search, random walk, and SPARQL queries, which are popular in real applications. These queries exhibit strong access locality and therefore require specific data partitioning strategies. We propose a Self Evolving Distributed Graph Management Environment (Sedge) to minimize inter-machine communication during graph query processing across multiple machines. To improve query response time and throughput, Sedge introduces a two-level partition management architecture with complementary primary partitions and dynamic secondary partitions. These two kinds of partitions adapt in real time to changes in query workload. Sedge also includes a set of workload analysis algorithms whose time complexity is linear or sublinear in the graph size. Empirical results show that Sedge significantly improves distributed graph processing on today's commodity clusters.
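
To make the idea of complementary partitions concrete, the following is a minimal sketch, not Sedge's actual implementation. It assumes a toy path graph, two hand-written vertex-to-machine assignments, and hypothetical helper names (`bfs_vertices`, `machines_touched`, `pick_partitioning`). It shows how a local query that straddles a boundary in one partitioning can fall entirely inside a single partition of another, so routing the query to the better-fitting partitioning avoids inter-machine communication.

```python
from collections import deque


def bfs_vertices(adj, start, depth):
    """Return the set of vertices within `depth` hops of `start`."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        v, d = frontier.popleft()
        if d == depth:
            continue  # do not expand beyond the hop limit
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                frontier.append((w, d + 1))
    return seen


def machines_touched(assignment, vertices):
    """Count the distinct machines holding the given vertices."""
    return len({assignment[v] for v in vertices})


def pick_partitioning(partitionings, vertices):
    """Pick the partitioning under which the query touches the fewest machines."""
    return min(
        partitionings,
        key=lambda name: machines_touched(partitionings[name], vertices),
    )


# Toy data (assumed for illustration): a path graph 0-1-2-...-7.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}

# Two hypothetical partitionings over two machines. The second one keeps
# together the vertices that the first one separates (the edge 3-4).
partitionings = {
    "primary": {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1},
    "complementary": {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1, 6: 0, 7: 0},
}

# A 1-hop BFS from vertex 3 reaches {2, 3, 4}, which crosses the primary
# boundary but lies inside one complementary partition.
query_a = bfs_vertices(adj, 3, 1)
choice_a = pick_partitioning(partitionings, query_a)  # "complementary"

# A 1-hop BFS from vertex 1 reaches {0, 1, 2}, which is local in the
# primary partitioning but split in the complementary one.
query_b = bfs_vertices(adj, 1, 1)
choice_b = pick_partitioning(partitionings, query_b)  # "primary"
```

In this sketch each query is simply routed to whichever partitioning covers it with the fewest machines. How Sedge builds its partitions and adapts them to the workload is not described in the abstract, so that part is left out here.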