Parallel and fully recursive multifrontal sparse Cholesky

Authors:
Dror Irony;Gil Shklarski;Sivan Toledo
Affiliations:
School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel (all three authors)
Venue:
Future Generation Computer Systems - Special issue: Selected numerical algorithms
Year:
2004

Abstract

We describe the design, implementation, and performance of a new parallel sparse Cholesky factorization code. The code uses a multifrontal factorization strategy. Operations on small dense submatrices are performed by new dense matrix subroutines that are part of the code, although the code can also use the BLAS and LAPACK. The new code is recursive at both the sparse and the dense levels. It uses a novel recursive data layout for dense submatrices, and it is parallelized using Cilk, an extension of C specifically designed to parallelize recursive codes. We demonstrate that the new code performs well and scales well on SMPs. In particular, on up to 16 processors, the code outperforms two state-of-the-art message-passing codes. The scalability and high performance that the code achieves imply that recursive schedules, blocked data layouts, and dynamic scheduling are effective in the implementation of sparse factorization codes.
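
To make "recursive at the dense level" concrete, the following is a minimal sketch of a recursive dense Cholesky factorization in plain Python. It is not the authors' code: the function name `recursive_cholesky`, the list-of-lists storage, and the halving rule for the split point are assumptions made for illustration. The sketch uses ordinary row-major storage rather than the recursive data layout the abstract mentions, and it runs serially rather than under Cilk.

```python
import math


def recursive_cholesky(a, lo=0, hi=None):
    """Overwrite the lower triangle of a[lo:hi][lo:hi] with its Cholesky factor.

    `a` is a symmetric positive definite matrix stored as a list of lists of
    floats. Only the lower triangle is read and written; entries above the
    diagonal are left untouched. (Hypothetical helper, for illustration only.)
    """
    if hi is None:
        hi = len(a)
    n = hi - lo
    if n == 1:
        # Base case: a 1-by-1 block is factored by a square root.
        if a[lo][lo] <= 0.0:
            raise ValueError("matrix is not positive definite")
        a[lo][lo] = math.sqrt(a[lo][lo])
        return a

    mid = lo + n // 2  # assumed split rule: halve the block

    # Step 1: factor the leading block, A11 = L11 * L11^T.
    recursive_cholesky(a, lo, mid)

    # Step 2: triangular solve for the off-diagonal block, L21 * L11^T = A21.
    for i in range(mid, hi):
        for j in range(lo, mid):
            s = a[i][j] - sum(a[i][k] * a[j][k] for k in range(lo, j))
            a[i][j] = s / a[j][j]

    # Step 3: symmetric update of the trailing block, A22 -= L21 * L21^T.
    for i in range(mid, hi):
        for j in range(mid, i + 1):
            a[i][j] -= sum(a[i][k] * a[j][k] for k in range(lo, mid))

    # Step 4: factor the updated trailing block.
    recursive_cholesky(a, mid, hi)
    return a
```

Calling `recursive_cholesky` on the matrix `[[4.0, 12.0, -16.0], [12.0, 37.0, -43.0], [-16.0, -43.0, 98.0]]` leaves the factor with rows `[2]`, `[6, 1]`, `[-8, 5, 3]` in the lower triangle. The two recursive calls depend on each other, so in a scheme of this shape any parallelism would have to come from the solve and update steps, which here are written as simple loops for clarity.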