The rapid emergence of multicore machines has led to the need to design new algorithms that are efficient on these architectures. Here, we consider the solution of sparse symmetric positive-definite linear systems by Cholesky factorization. We were motivated by the successful division of the computation in the dense case into tasks on blocks and use of a task manager to exploit all the parallelism that is available between these tasks, whose dependencies may be represented by a directed acyclic graph (DAG). Our sparse algorithm is built on the assembly tree and subdivides the work at each node into tasks on blocks of the Cholesky factor. The dependencies between these tasks may again be represented by a DAG. To limit memory requirements, blocks are updated directly rather than through generated-element matrices. Our algorithm is implemented within a new efficient and portable solver HSL_MA87. It is written in Fortran 95 plus OpenMP and is available as part of the software library HSL. Using problems arising from a range of applications, we present experimental results that support our design choices and demonstrate that HSL_MA87 obtains good serial and parallel times on our 8-core test machines. Comparisons are made with existing modern solvers and show that HSL_MA87 performs well, particularly in the case of very large problems.
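The dense-case idea that motivated this work (split the factorization into tasks on blocks, record their dependencies as a DAG, and let a task manager run whatever is ready) can be illustrated with a minimal sketch. This is not the HSL_MA87 implementation: it handles only a dense matrix, runs the tasks serially, and every name in it (`build_tasks`, `run_task`, `cholesky_dag`, and the task labels `factor`, `solve`, `update`) is a hypothetical choice made for illustration. The sparse algorithm would, per the abstract, generate such tasks from the nodes of the assembly tree.

```python
import math


def build_tasks(nblk):
    """Map each block task to the set of tasks it must wait for."""
    deps = {}
    last_writer = {}  # block (i, j) -> most recent task that wrote it

    def add(task, block, reads):
        pre = set(reads)
        if block in last_writer:  # serialize writes to the same block
            pre.add(last_writer[block])
        deps[task] = pre
        last_writer[block] = task

    for k in range(nblk):
        add(("factor", k), (k, k), [])
        for i in range(k + 1, nblk):
            add(("solve", i, k), (i, k), [("factor", k)])
        for j in range(k + 1, nblk):
            for i in range(j, nblk):
                add(("update", i, j, k), (i, j),
                    [("solve", i, k), ("solve", j, k)])
    return deps


def _dot(a, r, c, cols):
    return sum(a[r][p] * a[c][p] for p in cols)


def run_task(a, nb, task):
    """Apply one block task in place to the lower triangle of a."""
    n = len(a)

    def span(b):
        return range(b * nb, min((b + 1) * nb, n))

    kind = task[0]
    if kind == "factor":  # Cholesky of diagonal block (k, k)
        cols = span(task[1])
        for c in cols:
            prev = range(cols.start, c)
            a[c][c] = math.sqrt(a[c][c] - _dot(a, c, c, prev))
            for r in range(c + 1, cols.stop):
                a[r][c] = (a[r][c] - _dot(a, r, c, prev)) / a[c][c]
    elif kind == "solve":  # L_ik = A_ik * L_kk^{-T}
        _, i, k = task
        cols = span(k)
        for r in span(i):
            for c in cols:
                prev = range(cols.start, c)
                a[r][c] = (a[r][c] - _dot(a, r, c, prev)) / a[c][c]
    else:  # update: A_ij -= L_ik * L_jk^T, applied directly to the block
        _, i, j, k = task
        for r in span(i):
            for c in span(j):
                if c <= r:  # only the lower triangle is stored
                    a[r][c] -= _dot(a, r, c, span(k))


def cholesky_dag(a, nb):
    """Factorize a in place by running tasks in a DAG-consistent order."""
    nblk = -(-len(a) // nb)  # ceiling division
    deps = build_tasks(nblk)
    children = {t: [] for t in deps}
    for t, pre in deps.items():
        for p in pre:
            children[p].append(t)
    waiting = {t: len(pre) for t, pre in deps.items()}
    ready = [t for t, count in waiting.items() if count == 0]
    order = []
    while ready:
        t = ready.pop()  # any ready task may run next
        run_task(a, nb, t)
        order.append(t)
        for child in children[t]:
            waiting[child] -= 1
            if waiting[child] == 0:
                ready.append(child)
    return deps, order
```

In a parallel setting the `ready` list would be shared between threads, each of which repeatedly takes a ready task and executes it. The serial loop here only shows that correctness depends on the DAG, not on a fixed loop order.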