Brief announcement: Lower bounds on communication for sparse Cholesky factorization of a model problem

Authors:
Laura Grigori;Pierre-Yves David;James W. Demmel;Sylvain Peyronnet
Affiliations:
INRIA Saclay-Ile de France, Orsay, France;INRIA Saclay-Ile de France, Orsay, France;Departments of Mathematics and CS, University of California Berkeley, Berkeley, USA;LRI, Université Paris-Sud 11, Orsay, France
Venue:
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Year:
2010

Citing 5
Cited 3

Highly Scalable Parallel Algorithms for Sparse Matrix Factorization

IEEE Transactions on Parallel and Distributed Systems
I/O complexity: The red-blue pebble game

STOC '81 Proceedings of the thirteenth annual ACM symposium on Theory of computing
Communication lower bounds for distributed-memory matrix multiplication

Journal of Parallel and Distributed Computing
Communication avoiding Gaussian elimination

Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Communication-optimal parallel and sequential Cholesky decomposition: extended abstract

Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures

Managing data-movement for effective shared-memory parallelization of out-of-core sparse solvers

SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Graph expansion and communication costs of fast matrix multiplication

Journal of the ACM (JACM)
Communication optimal parallel multiplication of sparse random matrices

Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures

Quantified Score

Hi-index	0.00

Visualization

Abstract

Previous work has shown that a lower bound on the number of words moved between large, slow memory and small, fast memory of size M by any conventional (non-Strassen like) direct linear algebra algorithm (matrix multiply, the LU, Cholesky, QR factorizations,...) is Ω(# flops / √ (M)). This holds for dense or sparse matrices. There are analogous lower bounds for the number of messages, and for parallel algorithms instead of sequential algorithms. Our goal here is to find algorithms that attain these lower bounds on interesting classes of sparse matrices. We focus on matrices for which there is a lower bound on the number of flops of their Cholesky factorization. Our Cholesky lower bounds on communication hold for any possible ordering of the rows and columns of the matrix, and so are globally optimal in this sense. For matrices arising from discretization on two dimensional and three dimensional regular grids, we discuss sequential and parallel algorithms that are optimal in terms of communication. The algorithms turn out to require combining previously known sparse and dense Cholesky algorithms in simple ways