Expressing graph algorithms using generalized active messages
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Graph computation has recently emerged as an important class of high-performance computing application whose characteristics differ markedly from those of traditional, compute-bound kernels. Libraries such as BLAS and LAPACK have been successful in codifying best practices in numerical computing. The data-driven nature of graph applications, by contrast, necessitates a more complex application stack that incorporates runtime optimization. In this paper, we present a method of phrasing graph algorithms as collections of concise, asynchronous, concurrently executing code fragments that may be invoked both locally and in remote address spaces. A runtime layer performs a number of dynamic optimizations, including message coalescing, message combining, and software routing. We identify a number of common patterns in these algorithms and explore how this programming model can express them. We discuss algorithmic transformations that expose asynchrony, which the runtime can leverage to improve performance and reduce resource utilization. Practical implementations and performance results are provided for a number of representative algorithms.
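To make the programming model concrete, the following is a minimal single-process sketch, not the implementation described in the paper. It assumes a toy runtime (the hypothetical `ToyRuntime` class) that simulates ranks inside one Python process, owns vertices by `vertex % num_ranks`, coalesces outgoing messages into per-destination batches, and combines messages addressed to the same vertex with `min`. Software routing is not modeled. The handler `relax` is a label-correcting shortest-path step, chosen here only as a familiar example of a short, message-driven code fragment.

```python
from collections import deque


class ToyRuntime:
    """Hypothetical stand-in for an active-message runtime (one process)."""

    def __init__(self, num_ranks, handler, combine=min, buffer_size=4):
        self.num_ranks = num_ranks
        self.handler = handler            # called as handler(runtime, key, value)
        self.combine = combine            # merges two values sent to one key
        self.buffer_size = buffer_size    # distinct keys held before a flush
        self.buffers = [dict() for _ in range(num_ranks)]
        self.pending = deque()            # batches "in flight"
        self.sends = 0                    # logical messages requested
        self.batches_sent = 0             # coalesced batches actually shipped

    def owner(self, key):
        # Assumed distribution: vertices are dealt out cyclically to ranks.
        return key % self.num_ranks

    def send(self, key, value):
        self.sends += 1
        dest = self.owner(key)
        buf = self.buffers[dest]
        if key in buf:
            # Message combining: keep one value per key in the buffer.
            buf[key] = self.combine(buf[key], value)
        else:
            buf[key] = value
        if len(buf) >= self.buffer_size:
            self.flush(dest)

    def flush(self, dest):
        buf = self.buffers[dest]
        if not buf:
            return
        self.buffers[dest] = {}
        self.batches_sent += 1
        # Message coalescing: many small messages travel as one batch.
        self.pending.append((dest, list(buf.items())))

    def run(self):
        # Deliver batches until no buffered or in-flight messages remain.
        while True:
            for dest in range(self.num_ranks):
                self.flush(dest)
            if not self.pending:
                return
            while self.pending:
                _dest, batch = self.pending.popleft()
                for key, value in batch:
                    self.handler(self, key, value)


def sssp(graph, source, num_ranks=2):
    """Shortest paths expressed as a single message handler."""
    dist = {v: float("inf") for v in graph}

    def relax(rt, v, d):
        # Handler body: improve the label, then message the neighbors.
        if d < dist[v]:
            dist[v] = d
            for w, weight in graph[v]:
                rt.send(w, d + weight)

    rt = ToyRuntime(num_ranks, relax)
    rt.send(source, 0)
    rt.run()
    return dist, rt


# Small example graph: vertex -> list of (neighbor, edge weight).
EXAMPLE_GRAPH = {0: [(1, 4), (2, 1)], 1: [(3, 1)], 2: [(1, 2), (3, 5)], 3: []}
```

Calling `sssp(EXAMPLE_GRAPH, 0)` returns the distance map together with the runtime object, whose `sends` and `batches_sent` counters show how many logical messages were folded into how many batches. In a real distributed setting the buffers would hold serialized messages destined for other address spaces, and termination detection would need a distributed protocol rather than the simple emptiness check used here.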