A Two-Dimensional Data Distribution Method for Parallel Sparse Matrix-Vector Multiplication

Authors:
Brendan Vastenhouw;Rob H. Bisseling
Venue:
SIAM Review
Year:
2005

Citing 0
Cited 36

New challenges in dynamic load balancing

Applied Numerical Mathematics - Adaptive methods for partial differential equations and large-scale computation
Efficient Data Distribution Schemes for EKMR-Based Sparse Arrays on Distributed Memory Multicomputers

The Journal of Supercomputing
Mondriaan sparse matrix partitioning for attacking cryptosystems by a parallel block Lanczos algorithm: a case study

Parallel Computing - Algorithmic skeletons
Data distribution schemes of sparse arrays on distributed memory multicomputers

The Journal of Supercomputing
Parallel multilevel algorithms for hypergraph partitioning

Journal of Parallel and Distributed Computing
Multi-level direct K-way hypergraph partitioning with multiple constraints and fixed vertices

Journal of Parallel and Distributed Computing
Optimization of sparse matrix-vector multiplication on emerging multicore platforms

Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Adaptive runtime tuning of parallel sparse matrix-vector multiplication on distributed memory systems

Proceedings of the 22nd annual international conference on Supercomputing
A Parallel Matrix Scaling Algorithm

High Performance Computing for Computational Science - VECPAR 2008
Optimization of sparse matrix-vector multiplication on emerging multicore platforms

Parallel Computing
Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks

Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures
New challenges in dynamic load balancing

Applied Numerical Mathematics - Adaptive methods for partial differential equations and large-scale computation
Hierarchical test sequencing for complex systems

IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans
Parallel symmetric sparse matrix-vector product on scalar multi-core CPUs

Parallel Computing
A Matrix Partitioning Interface to PaToH in MATLAB

Parallel Computing
Parallel greedy graph matching using an edge partitioning approach

Proceedings of the fourth international workshop on High-level parallel programming and applications
Hypergraph-based multilevel matrix approximation for text information retrieval

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
A scalable parallel union-find algorithm for distributed memory computers

PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part I
Parallel hypergraph partitioning for scientific computing

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
On Two-Dimensional Sparse Matrix Partitioning: Models, Methods, and a Recipe

SIAM Journal on Scientific Computing
A scalable eigensolver for large scale-free graphs using 2D graph partitioning

Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Two implementations of the preconditioned conjugate gradient method on heterogeneous computing grids

International Journal of Applied Mathematics and Computer Science - Computational Intelligence in Modern Control Systems
Parallel algorithms for bipartite matching problems on distributed memory computers

Parallel Computing
Two-dimensional cache-oblivious sparse matrix-vector multiplication

Parallel Computing
Hypergraph-Based Unsymmetric Nested Dissection Ordering for Sparse LU Factorization

SIAM Journal on Scientific Computing
A general graph model for representing exact communication volume in parallel sparse matrix–vector multiplication

ISCIS'06 Proceedings of the 21st international conference on Computer and Information Sciences
Replicated partitioning for undirected hypergraphs

Journal of Parallel and Distributed Computing
Hypergraph partitioning for faster parallel pagerank computation

EPEW'05/WS-FM'05 Proceedings of the 2005 international conference on European Performance Engineering, and Web Services and Formal Methods, international conference on Formal Techniques for Computer Systems and Business Processes
On partitioning problems with complex objectives

Euro-Par'11 Proceedings of the 2011 international conference on Parallel Processing
An object-oriented bulk synchronous parallel library for multicore programming

Concurrency and Computation: Practice & Experience
Partitioning Hypergraphs in Scientific Computing Applications through Vertex Separators on Graphs

SIAM Journal on Scientific Computing
A GPU algorithm for greedy graph matching

Facing the Multicore-Challenge II
Load-balancing spatially located computations using rectangular partitions

Journal of Parallel and Distributed Computing
Parallel computation of continuous Petri nets based on hypergraph partitioning

The Journal of Supercomputing
Scalable matrix computations on large scale-free graphs using 2D graph partitioning

SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
A new metric enabling an exact hypergraph model for the communication volume in distributed-memory parallel applications

Parallel Computing

Abstract

A new method is presented for distributing data in sparse matrix-vector multiplication. The method is two-dimensional, tries to minimize the true communication volume, and also tries to spread the computation and communication work evenly over the processors. The method starts with a recursive bipartitioning of the sparse matrix, each time splitting a rectangular matrix into two parts with a nearly equal number of nonzeros. The communication volume caused by the split is minimized. After the matrix partitioning, the input and output vectors are partitioned with the objective of minimizing the maximum communication volume per processor. Experimental results of our implementation, Mondriaan, for a set of sparse test matrices show a reduction in communication volume compared to one-dimensional methods, and in general a good balance in the communication work. Experimental timings of an actual parallel sparse matrix-vector multiplication on an SGI Origin 3800 computer show that a sufficiently large reduction in communication volume leads to savings in execution time.
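
The quantity the abstract calls communication volume can be illustrated with a small sketch. This is not the Mondriaan code. It assumes the common counting rule that a row or column whose nonzeros are spread over k processors costs k - 1 data words, which holds when each vector component is owned by one of the processors holding a nonzero in that row or column. The function name `communication_volume`, the 4x4 "arrow" pattern, and both example partitions are made up for illustration.

```python
from collections import defaultdict


def communication_volume(nonzeros, owner):
    """Count data words moved in y = A*x for a given nonzero-to-processor map.

    nonzeros: iterable of (row, col) positions of the nonzeros of A.
    owner:    function mapping (row, col) to a processor id.
    Assumption: each vector component lives on a processor that already
    holds a nonzero in the matching row or column, so a row or column
    touched by k processors costs k - 1 words.
    """
    row_procs = defaultdict(set)
    col_procs = defaultdict(set)
    for i, j in nonzeros:
        p = owner(i, j)
        row_procs[i].add(p)
        col_procs[j].add(p)
    # Fan-out: input component x_j is sent to every other processor
    # that holds a nonzero in column j.
    fan_out = sum(len(procs) - 1 for procs in col_procs.values())
    # Fan-in: partial sums for y_i are sent to the owner of y_i.
    fan_in = sum(len(procs) - 1 for procs in row_procs.values())
    return fan_out + fan_in


# Hypothetical 4x4 "arrow" pattern: full first row, full first column, diagonal.
arrow = sorted(
    {(0, j) for j in range(4)}
    | {(i, 0) for i in range(4)}
    | {(i, i) for i in range(4)}
)


def one_dimensional(i, j):
    # Row-wise split: rows 0-1 on processor 0, rows 2-3 on processor 1.
    return 0 if i < 2 else 1


def two_dimensional(i, j):
    # Processor 0 gets the top-left 2x2 block, processor 1 gets the rest.
    return 0 if (i < 2 and j < 2) else 1


volume_1d = communication_volume(arrow, one_dimensional)
volume_2d = communication_volume(arrow, two_dimensional)
print("1D volume:", volume_1d)
print("2D volume:", volume_2d)
```

On this toy pattern both partitions give one processor 6 nonzeros and the other 4, yet the two-dimensional assignment needs fewer words than the row-wise one. The example only demonstrates the counting rule; it says nothing about how a partitioner would find such an assignment.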