The sparse matrix solver is a critical component in circuit simulators. Several research groups have developed GPU-based LU factorization approaches to accelerate the sparse solver, but the performance of these solvers is constrained by the irregularity of sparse matrices. This work investigates the nonzero patterns and memory access patterns in sparse LU factorization and identifies their common features to derive guidelines for improving GPU solvers. We further propose a crisscross blocked implementation on GPUs. On circuit matrices, the proposed method attains average speedups of 1.68× over the unblocked method and 2.2× over 4-threaded PARDISO.