Sparse LU factorization for parallel circuit simulation on GPU

Authors:
Ling Ren; Xiaoming Chen; Yu Wang; Chenxi Zhang; Huazhong Yang
Affiliations:
Tsinghua University, Beijing, China (all authors)
Venue:
Proceedings of the 49th Annual Design Automation Conference
Year:
2012

Abstract

The sparse solver has become the bottleneck of SPICE simulators. There has been little work on GPU-based sparse solvers because of the high data dependency. This strong data dependency means that parallel sparse LU factorization runs efficiently on shared-memory computing devices, but the number of CPU cores sharing the same memory is often limited. State-of-the-art graphics processing units (GPUs) naturally have numerous cores sharing the device memory and offer a possible solution to this problem. In this paper, we propose a GPU-based sparse LU solver for circuit simulation. We optimize the work partitioning, the number of active thread groups, and the memory access pattern based on the GPU architecture. On matrices whose factorization involves many floating-point operations, our GPU-based sparse LU factorization achieves a 7.90x speedup over a 1-core CPU and a 1.49x speedup over an 8-core CPU. We also analyze the scalability of parallel sparse LU factorization and investigate the CPU and GPU specifications that most influence performance.
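
The data dependency mentioned in the abstract can be illustrated with a small sketch. It assumes a column-oriented (left-looking) factorization, in which column k can be computed only after every earlier column j with a nonzero U(j, k). This is an illustration under that assumption, not the paper's implementation; the function names `column_levels` and `group_by_level` and the example pattern are hypothetical.

```python
def column_levels(upper_pattern):
    """Assign each column a dependency level.

    upper_pattern[k] lists the columns j < k with a nonzero U[j, k]
    (assumed input format). Columns on the same level do not depend
    on each other, so they could be factorized concurrently.
    """
    levels = []
    for k, deps in enumerate(upper_pattern):
        level = 0
        for j in deps:
            if j >= k:
                raise ValueError("a column may depend only on earlier columns")
            # Column k must wait one step longer than its latest dependency.
            level = max(level, levels[j] + 1)
        levels.append(level)
    return levels


def group_by_level(levels):
    """Group column indices by level, in increasing level order."""
    groups = {}
    for col, lvl in enumerate(levels):
        groups.setdefault(lvl, []).append(col)
    return [groups[lvl] for lvl in sorted(groups)]


# Hypothetical 5-column pattern: columns 0 and 1 are independent,
# column 2 needs column 0, column 3 needs column 1, column 4 needs 2 and 3.
pattern = [[], [], [0], [1], [2, 3]]
schedule = group_by_level(column_levels(pattern))
print(schedule)  # [[0, 1], [2, 3], [4]]
```

In this example at most two columns are ready at any time, which shows how the dependency structure, rather than the core count alone, can bound the available parallelism.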