Parallel algorithms for sparse matrix-matrix multiplication typically spend most of their time on inter-processor communication rather than on computation, and hardware trends predict the relative cost of communication will only increase. Thus, sparse matrix multiplication algorithms must minimize communication costs in order to scale to large processor counts. In this paper, we consider multiplying sparse matrices corresponding to Erdős-Rényi random graphs on distributed-memory parallel machines. We prove a new lower bound on the expected communication cost for a wide class of algorithms. Our analysis of existing algorithms shows that, while some are optimal for a limited range of matrix density and number of processors, none is optimal in general. We obtain two new parallel algorithms and prove that they match the expected communication cost lower bound, and hence they are optimal.
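
To make the problem setting concrete, the following is a minimal sequential sketch of the kind of input the abstract describes. It is not one of the paper's parallel algorithms. It assumes an n x n matrix in which each entry is nonzero independently with probability d/n (so d is the expected number of nonzeros per row), and it multiplies two such matrices row by row while counting scalar multiplications. The names `erdos_renyi_sparse`, `spgemm`, and `nnz`, the dictionary-of-rows storage, and the parameters `n` and `d` are all illustrative assumptions, not taken from the paper.

```python
import random
from collections import defaultdict


def erdos_renyi_sparse(n, d, seed=None):
    """Return an n x n sparse matrix as {row: {col: value}}.

    Assumption: each entry is nonzero (value 1.0) independently with
    probability d / n, so a row holds d nonzeros in expectation.
    """
    rng = random.Random(seed)
    p = d / n
    rows = {}
    for i in range(n):
        row = {j: 1.0 for j in range(n) if rng.random() < p}
        if row:
            rows[i] = row
    return rows


def spgemm(a, b):
    """Row-by-row sparse product C = A * B.

    Returns (C, flops), where flops counts scalar multiplications.
    """
    c = {}
    flops = 0
    for i, a_row in a.items():
        acc = defaultdict(float)  # sparse accumulator for row i of C
        for k, a_ik in a_row.items():
            # Scale row k of B by A[i][k] and add it into row i of C.
            for j, b_kj in b.get(k, {}).items():
                acc[j] += a_ik * b_kj
                flops += 1
        if acc:
            c[i] = dict(acc)
    return c, flops


def nnz(m):
    """Number of stored nonzeros in a dictionary-of-rows matrix."""
    return sum(len(row) for row in m.values())


if __name__ == "__main__":
    n, d = 200, 4
    a = erdos_renyi_sparse(n, d, seed=1)
    b = erdos_renyi_sparse(n, d, seed=2)
    c, flops = spgemm(a, b)
    print("nnz(A) =", nnz(a), "nnz(B) =", nnz(b), "nnz(C) =", nnz(c))
    print("scalar multiplications =", flops)
```

Under this model the expected number of scalar multiplications is d²n, since each of the n³ index triples (i, k, j) contributes with probability (d/n)². The arithmetic is therefore cheap when d is small, which is consistent with the abstract's point that communication, not computation, dominates the cost of the parallel algorithms.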