The emergence of multicore architectures and highly scalable platforms motivates the development of novel algorithms and techniques that emphasize concurrency and tolerate deep memory hierarchies, rather than minimizing raw FLOP counts. While direct solvers are reliable, they are often slow and memory-intensive for large problems. Iterative solvers, on the other hand, are more efficient but, in the absence of robust preconditioners, lack reliability. While preconditioners based on incomplete factorizations (whenever they exist) are effective for many problems, their parallel scalability is generally limited. In this paper, we advocate the use of banded preconditioners instead and introduce a reordering strategy that enables their extraction. In contrast to traditional bandwidth reduction techniques, our reordering strategy takes into account the magnitude of the matrix entries, bringing the heaviest elements closer to the diagonal, so that a narrow band captures most of the weight of the matrix and can serve as a preconditioner. We show that banded preconditioners, when used with an effective banded solver (in our case, the Spike solver), (i) are more robust than the broad class of incomplete factorization-based preconditioners, (ii) deliver higher processor performance, resulting in faster time to solution, and (iii) scale to larger parallel configurations. We demonstrate these results experimentally on a large class of problems selected from diverse application domains.
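The idea of a magnitude-aware reordering can be illustrated with a small sketch. The abstract does not specify the reordering algorithm, so the code below assumes a weighted spectral ordering as a stand-in: it sorts rows and columns by the Fiedler vector of a graph Laplacian whose edge weights are the entry magnitudes. The function names (`weighted_spectral_order`, `band_weight_fraction`, `extract_band`) and the 8-by-8 test matrix are hypothetical, and the dense eigensolver is only suitable for tiny examples.

```python
import numpy as np

def weighted_spectral_order(A):
    """Permutation that sorts rows/columns by the Fiedler vector of the
    weighted Laplacian of |A| + |A|^T (an assumed stand-in ordering)."""
    W = np.abs(A) + np.abs(A).T        # symmetric edge weights from magnitudes
    np.fill_diagonal(W, 0.0)           # the diagonal carries no ordering information
    L = np.diag(W.sum(axis=1)) - W     # weighted graph Laplacian
    _, vecs = np.linalg.eigh(L)        # eigenvalues are returned in ascending order
    return np.argsort(vecs[:, 1])      # sort by the second-smallest eigenvector

def band_weight_fraction(A, k):
    """Fraction of the squared Frobenius norm lying within |i - j| <= k."""
    i, j = np.indices(A.shape)
    in_band = np.abs(i - j) <= k
    return float((A[in_band] ** 2).sum() / (A ** 2).sum())

def extract_band(A, k):
    """Keep entries with |i - j| <= k and zero the rest."""
    i, j = np.indices(A.shape)
    return np.where(np.abs(i - j) <= k, A, 0.0)

# Hypothetical test matrix: a heavy tridiagonal core plus two weak far entries.
n = 8
T = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
T[0, 5] = 1e-3
T[2, 7] = 1e-3

# Hide the band structure with a fixed symmetric permutation.
scramble = np.array([3, 0, 6, 1, 7, 4, 2, 5])
A = T[np.ix_(scramble, scramble)]

# Reorder by entry magnitude and compare how much weight a bandwidth-1 band holds.
p = weighted_spectral_order(A)
B = A[np.ix_(p, p)]
before = band_weight_fraction(A, 1)
after = band_weight_fraction(B, 1)
M = extract_band(B, 1)                 # candidate banded preconditioner
print(f"weight in band before: {before:.4f}, after: {after:.4f}")
```

In this toy case the heavy entries form a chain, so the ordering pulls them back next to the diagonal while the weak entries are left outside the band. For large sparse matrices a sparse eigensolver would replace the dense call, and the banded matrix `M` would be handed to a banded solver rather than stored densely.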