In this paper, we discuss a more scalable out-of-core (OOC) implementation of a dense linear system solver based on an LU factorization whose numerical stability is similar to that of LU factorization with partial pivoting. Our implementation builds on the Formal Linear Algebra Methods Environment (FLAME), the Parallel Linear Algebra Package (PLAPACK), and the Parallel Out-of-Core Linear Algebra Package (POOCLAPACK) infrastructures. Experimental results on an Intel Itanium 2 platform demonstrate the high performance of this approach.
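For readers unfamiliar with the stability baseline mentioned above, the following is a minimal in-core sketch of a dense solver using LU factorization with partial pivoting. It is a textbook illustration only, not the out-of-core algorithm of this paper, and it does not use FLAME, PLAPACK, or POOCLAPACK; the function names `lu_partial_pivot` and `solve` are hypothetical and chosen for this example.

```python
def lu_partial_pivot(a):
    """Return (lu, perm) such that rows of `a` reordered by `perm` equal L*U.

    L (unit lower triangular) and U (upper triangular) are stored together
    in `lu`; perm[i] is the index of the original row now at position i.
    Illustrative in-core version, assumed small enough to fit in memory.
    """
    n = len(a)
    lu = [row[:] for row in a]  # work on a copy
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the largest-magnitude entry in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Eliminate below the pivot, storing multipliers in place of zeros.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            m = lu[i][k]
            for j in range(k + 1, n):
                lu[i][j] -= m * lu[k][j]
    return lu, perm


def solve(a, b):
    """Solve a x = b via the factorization above."""
    lu, perm = lu_partial_pivot(a)
    n = len(a)
    # Forward substitution: L y = P b (L has an implicit unit diagonal).
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))
    # Back substitution: U x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(lu[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / lu[i][i]
    return x
```

Because each pivot is the largest entry in its column, every multiplier has magnitude at most one. The row swaps that this requires touch entire rows of the matrix, which is the kind of data movement an out-of-core implementation has to manage carefully.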