Nonzero pattern analysis and memory access optimization in GPU-based sparse LU factorization for circuit simulation

Authors:
Xiaoming Chen; Du Su; Yu Wang; Huazhong Yang
Affiliations:
Tsinghua University, Beijing, China (all authors)
Venue:
IA^3 '13 Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms
Year:
2013

Abstract

The sparse matrix solver is a critical component of circuit simulators. Several research efforts have developed GPU-based LU factorization approaches to accelerate the sparse solver, but the performance of these solvers is constrained by the irregularity of sparse matrices. This work investigates the nonzero patterns and memory access patterns in sparse LU factorization and identifies their common features to derive guidelines for improving GPU solvers. We further propose a crisscross blocked implementation on GPUs. On circuit matrices, the proposed method attains average speedups of 1.68× over the unblocked method and 2.2× over 4-threaded PARDISO.