This paper concerns latency-tolerant schemes for the efficient parallel solution of sparse triangular linear systems on distributed-memory multiprocessors. Such triangular solutions are required when sparse Cholesky factors are used to solve for a sequence of right-hand-side vectors, or when incomplete sparse Cholesky factors are used to precondition a Conjugate Gradient iterative solver. In such applications, traditional distributed substitution schemes can become a performance bottleneck when the latency of interprocessor communication is large. We had earlier developed the Selective Inversion (SI) scheme, which reduces communication latency costs by replacing distributed substitution with parallel matrix-vector multiplication. We now present a new two-way mapping of the sparse triangular matrix to processors that improves the performance of SI by halving its communication latency costs. We provide analytic results for model sparse matrices and report on the performance of our scheme for parallel preconditioning with incomplete sparse Cholesky factors.
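The core contrast in the abstract can be illustrated with a toy sketch: forward substitution for L x = b has a serial chain of dependences (each x[i] needs earlier entries, which on a distributed machine means many short latency-bound messages), whereas applying a precomputed inverse is a single matrix-vector product. The actual Selective Inversion scheme inverts only selected dense submatrices of the sparse factor and uses a two-way processor mapping; the dense, whole-factor inversion below is only a simplified assumption made to show the algebraic idea, not the authors' algorithm.

```python
def forward_substitution(L, b):
    """Solve L x = b for dense lower-triangular L by substitution.
    Each x[i] depends on all earlier x[j]; distributed across processors,
    this dependence chain forces a sequence of latency-bound messages."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x

def invert_lower_triangular(L):
    """Form L^{-1} (also lower triangular) explicitly, column by column,
    by solving L y = e_k for each unit vector e_k."""
    n = len(L)
    cols = []
    for k in range(n):
        e = [1.0 if i == k else 0.0 for i in range(n)]
        cols.append(forward_substitution(L, e))
    # cols[k] is column k of L^{-1}; transpose into row-major form.
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def solve_by_multiplication(Linv, b):
    """x = L^{-1} b: a matrix-vector product, which parallelizes with
    one reduction per row instead of a chain of dependent messages."""
    n = len(b)
    return [sum(Linv[i][j] * b[j] for j in range(n)) for i in range(n)]
```

Both routes produce the same solution; the payoff of the multiplication form is purely in its communication pattern, which is why SI pays the one-time cost of inversion when many right-hand sides must be solved.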