We investigate the efficient iterative solution of large-scale sparse linear systems on shared-memory multiprocessors. Our parallel approach is based on a multilevel ILU preconditioner that preserves the mathematical semantics of the sequential method in ILUPACK. We exploit the parallelism exposed by the task tree corresponding to the nested dissection hierarchy (task parallelism), employ dynamic scheduling of tasks to processors to improve load balance, and formulate all stages of the parallel preconditioned conjugate gradient (PCG) method to conform with the computation of the preconditioner, which increases data reuse. Results on a CC-NUMA platform with 16 processors reveal the parallel efficiency of this solution.
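The task-tree idea can be illustrated with a minimal sketch. It is not ILUPACK code and makes several assumptions: the function name `run_task_tree`, the `children` dictionary, and the 7-node tree are hypothetical, and `work` stands in for whatever is computed at a tree node. Leaves of the nested dissection hierarchy are ready immediately, a separator node becomes ready only once all of its children have finished, and idle workers pull the next ready task from a shared queue (dynamic scheduling).

```python
import queue
import threading


def run_task_tree(children, work, num_workers=4):
    """Run work(node) on every node of a tree, children before parents.

    children: dict mapping a node to the list of its child nodes
              (hypothetical representation of a nested dissection tree).
    Returns the order in which nodes completed.
    Note: this sketch does not handle exceptions raised by work().
    """
    parent = {c: p for p, cs in children.items() for c in cs}
    nodes = set(children) | set(parent)
    if not nodes:
        return []

    # Number of unfinished children per node; leaves start at zero.
    pending = {n: len(children.get(n, ())) for n in nodes}

    ready = queue.Queue()
    for n in nodes:
        if pending[n] == 0:
            ready.put(n)

    lock = threading.Lock()
    order = []

    def worker():
        while True:
            node = ready.get()      # dynamic scheduling: take any ready task
            if node is None:        # sentinel: all tasks are done
                return
            work(node)
            with lock:
                order.append(node)
                p = parent.get(node)
                if p is not None:
                    pending[p] -= 1
                    if pending[p] == 0:
                        ready.put(p)    # last child done, parent is ready
                if len(order) == len(nodes):
                    for _ in range(num_workers):
                        ready.put(None)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


# Hypothetical 3-level tree: node 0 is the root separator,
# nodes 1 and 2 are second-level separators, nodes 3..6 are leaf subdomains.
TREE = {0: [1, 2], 1: [3, 4], 2: [5, 6]}

order = run_task_tree(TREE, lambda node: None)
print(order[-1])  # the root separator always completes last
```

The completion order of the leaves varies from run to run, which is the point of dynamic scheduling: whichever worker is free takes the next ready task, while the dependency counts guarantee that a parent never starts before its children.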