A bridging model for parallel computation
Communications of the ACM
The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
The webgraph framework I: compression techniques
Proceedings of the 13th international conference on World Wide Web
UbiCrawler: a scalable fully distributed web crawler
Software—Practice & Experience
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations
ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
Pregel: a system for large-scale graph processing
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Large graph processing in the cloud
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Twister: a runtime for iterative MapReduce
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Spark: cluster computing with working sets
HotCloud'10 Proceedings of the 2nd USENIX conference on Hot topics in cloud computing
HaLoop: efficient iterative data processing on large clusters
Proceedings of the VLDB Endowment
Signal/collect: graph algorithms for the (semantic) web
ISWC'10 Proceedings of the 9th international semantic web conference on The semantic web - Volume Part I
Proceedings of the 20th international conference on World wide web
Site-Based Partitioning and Repartitioning Techniques for Parallel PageRank Computation
IEEE Transactions on Parallel and Distributed Systems
iMapReduce: A Distributed Computing Framework for Iterative Computation
IPDPSW '11 Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum
The Combinatorial BLAS: design, implementation, and applications
International Journal of High Performance Computing Applications
Green-Marl: a DSL for easy and efficient graph analysis
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Towards effective partition management for large graphs
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Streaming graph partitioning for large distributed graphs
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Simplifying Scalable Graph Processing with a Domain-Specific Language
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Hi-index | 0.00 |
GPS (for Graph Processing System) is a complete open-source system we developed for scalable, fault-tolerant, and easy-to-program execution of algorithms on extremely large graphs. This paper serves the dual role of describing the GPS system, and presenting techniques and experimental results for graph partitioning in distributed graph-processing systems like GPS. GPS is similar to Google's proprietary Pregel system, with three new features: (1) an extended API to make global computations more easily expressed and more efficient; (2) a dynamic repartitioning scheme that reassigns vertices to different workers during the computation, based on messaging patterns; and (3) an optimization that distributes adjacency lists of high-degree vertices across all compute nodes to improve performance. In addition to presenting the implementation of GPS and its novel features, we also present experimental results on the performance effects of both static and dynamic graph partitioning schemes, and we describe the compilation of a high-level domain-specific programming language to GPS, enabling easy expression of complex algorithms.