An Asynchronous Parallel Supernodal Algorithm for Sparse Gaussian Elimination
SIAM Journal on Matrix Analysis and Applications
On Algorithms For Permuting Large Entries to the Diagonal of a Sparse Matrix
SIAM Journal on Matrix Analysis and Applications
SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems
ACM Transactions on Mathematical Software (TOMS)
Sparse gaussian elimination on high-performance computers
Sparse gaussian elimination on high-performance computers
An overview of SuperLU: Algorithms, implementation, and user interface
ACM Transactions on Mathematical Software (TOMS) - Special issue on the Advanced CompuTational Software (ACTS) Collection
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Hi-index | 0.04 |
The Chip Multiprocessor (CMP) will be the basic building block for computer systems ranging from laptops to supercomputers. New software developments at all levels are needed to fully utilize these systems. In this work, we evaluate performance of different high-performance sparse LU factorization and triangular solution algorithms on several representative multicore machines. We include both pthreads and MPI implementations in this study, and found that the pthreads implementation consistently delivers good performance and a left-looking algorithm is usually superior.