Several message-passing parallel solvers have been developed for general (non-symmetric) sparse LU factorization with partial pivoting. Because this application requires fine-grain synchronization and a large communication volume between computing nodes, existing solvers are mostly intended for tightly coupled parallel computing platforms with high message passing performance (e.g., 1-10 μs message latency and 100-1000 Mbytes/sec message throughput). To make use of platforms with slower message passing, this paper investigates techniques that can significantly reduce the application's communication needs. In particular, we propose batch pivoting, which makes pivot selections in groups through speculative factorization and thus makes inter-processor synchronization substantially less frequent. We experimented with an MPI-based implementation on several message passing platforms. On an IBM Regatta multiprocessor with fast message passing, speculative batch pivoting provides no performance benefit and even slightly weakens numerical stability; on an Ethernet-connected 16-node PC cluster, however, it improves performance on our test matrices by 28-292%. We also evaluated several other communication reduction techniques and found them less effective than our proposed approach.
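To make the idea of selecting pivots in groups concrete, the following is a minimal single-process sketch in Python. It is an illustration under stated assumptions, not the paper's implementation: the matrix is dense, nothing is distributed, and the names `lu_batch_pivot`, `residual`, and the `rounds` counter are hypothetical. For each batch of columns, the pivots are chosen by factoring a private copy of the panel, and only then are the row swaps committed to the real matrix. The `rounds` counter stands in for the number of pivot-selection synchronizations a parallel solver would need. Because the copy here holds every candidate row, the speculation always succeeds, so the stability check and fallback path that a real speculative scheme would need are omitted.

```python
def lu_batch_pivot(a, batch):
    """Dense LU with pivots chosen one column batch at a time (toy sketch).

    Returns (perm, lu, rounds): perm[i] is the original row now at position i,
    lu stores L (unit lower, below the diagonal) and U (upper) together, and
    rounds counts pivot-selection steps, a stand-in for synchronizations.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    rounds = 0
    for k0 in range(0, n, batch):
        k1 = min(k0 + batch, n)
        width = k1 - k0
        # Speculative step: factor a private copy of the panel
        # (rows k0..n-1, columns k0..k1-1) to pick all pivots of the batch.
        panel = [lu[i][k0:k1] for i in range(k0, n)]
        order = list(range(k0, n))  # row ids, permuted along with the copy
        for j in range(width):
            p = max(range(j, len(panel)), key=lambda r: abs(panel[r][j]))
            if panel[p][j] == 0.0:
                raise ZeroDivisionError("matrix is singular")
            panel[j], panel[p] = panel[p], panel[j]
            order[j], order[p] = order[p], order[j]
            for r in range(j + 1, len(panel)):
                m = panel[r][j] / panel[j][j]
                for c in range(j, width):
                    panel[r][c] -= m * panel[j][c]
        rounds += 1  # one selection round for the whole batch
        # Commit: apply all row swaps of the batch, then eliminate
        # the batch columns without any further pivot search.
        lu[k0:] = [lu[i] for i in order]
        perm[k0:] = [perm[i] for i in order]
        for k in range(k0, k1):
            for i in range(k + 1, n):
                lu[i][k] /= lu[k][k]
                m = lu[i][k]
                for c in range(k + 1, n):
                    lu[i][c] -= m * lu[k][c]
    return perm, lu, rounds


def residual(a, perm, lu):
    """Largest absolute entry of P*A - L*U for the factors above."""
    n = len(a)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                s += l_ik * lu[k][j]
            worst = max(worst, abs(s - a[perm[i]][j]))
    return worst


if __name__ == "__main__":
    a = [[2.0, 1.0, 1.0, 0.0],
         [4.0, 3.0, 3.0, 1.0],
         [8.0, 7.0, 9.0, 5.0],
         [6.0, 7.0, 9.0, 8.0]]
    for b in (1, 2, 4):
        perm, lu, rounds = lu_batch_pivot(a, b)
        print(b, rounds, residual(a, perm, lu))
```

With a batch size of 1 the sketch reduces to ordinary column-by-column partial pivoting, with one selection round per column. Larger batches need only one round per batch, which is the kind of saving that matters when each round costs a slow message exchange.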