We describe the design, implementation, and performance of a new parallel sparse Cholesky factorization code. The code uses a multifrontal factorization strategy. Operations on small dense submatrices are performed using new dense matrix subroutines that are part of the code, although the code can also use the BLAS and LAPACK. The new code is recursive at both the sparse and the dense levels, uses a novel recursive data layout for dense submatrices, and is parallelized using Cilk, an extension of C specifically designed to parallelize recursive codes. We demonstrate that the new code performs well and scales well on SMPs. In particular, on up to 16 processors, the code outperforms two state-of-the-art message-passing codes. The scalability and high performance that the code achieves imply that recursive schedules, blocked data layouts, and dynamic scheduling are effective in the implementation of sparse factorization codes.
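To make the idea of a dense-level recursive schedule concrete, the following is a minimal sketch of a recursive Cholesky factorization on a small dense matrix. It is not the code described above: it is plain sequential Python on nested lists, it uses ordinary row-major storage rather than a recursive data layout, and the names `recursive_cholesky` and `lower_triangle` are hypothetical. It only illustrates the split into four steps: factor the leading block, solve for the off-diagonal block, update the trailing block, and recurse.

```python
import math


def recursive_cholesky(a, lo=0, hi=None):
    """Overwrite the lower triangle of a[lo:hi][lo:hi] with its Cholesky factor L.

    `a` is a symmetric positive definite matrix stored as a list of lists.
    Only the lower triangle is read and written (illustrative sketch only).
    """
    if hi is None:
        hi = len(a)
    n = hi - lo
    if n <= 0:
        return
    if n == 1:
        # Base case: a 1-by-1 block is factored by a square root.
        if a[lo][lo] <= 0.0:
            raise ValueError("matrix is not positive definite")
        a[lo][lo] = math.sqrt(a[lo][lo])
        return

    mid = lo + n // 2

    # Step 1: factor the leading block, A11 = L11 * L11^T.
    recursive_cholesky(a, lo, mid)

    # Step 2: triangular solve for the off-diagonal block, L21 * L11^T = A21.
    for i in range(mid, hi):
        for j in range(lo, mid):
            s = a[i][j] - sum(a[i][k] * a[j][k] for k in range(lo, j))
            a[i][j] = s / a[j][j]

    # Step 3: symmetric update of the trailing block, A22 -= L21 * L21^T.
    for i in range(mid, hi):
        for j in range(mid, i + 1):
            a[i][j] -= sum(a[i][k] * a[j][k] for k in range(lo, mid))

    # Step 4: factor the updated trailing block.
    recursive_cholesky(a, mid, hi)


def lower_triangle(a):
    """Return the lower triangle of `a` as a list of rows of increasing length."""
    return [row[: i + 1] for i, row in enumerate(a)]


a = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
recursive_cholesky(a)
print(lower_triangle(a))  # [[2.0], [6.0, 1.0], [-8.0, 5.0, 3.0]]
```

In this sketch, the rows handled in steps 2 and 3 do not depend on one another, so a language with spawn-style parallelism could in principle process them concurrently. How the described code actually partitions and schedules that work is not stated in the abstract.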