Sparse LU factorization with partial pivoting on distributed memory machines

  • Authors:
  • Cong Fu;Tao Yang

  • Affiliations:
  • Department of Computer Science, University of California, Santa Barbara, CA;Department of Computer Science, University of California, Santa Barbara, CA

  • Venue:
  • Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
  • Year:
  • 1996

Quantified Score

Hi-index 0.00

Visualization

Abstract

Sparse LU factorization with partial pivoting is important to many scientific applications, but the effective parallelization of this algorithm is still an open problem. The main difficulty is that partial pivoting operations make structures of L and U factors unpredictable beforehand. This paper presents a novel approach called S* for parallelizing this problem on distributed memory machines. S* incorporates static symbolic factorization to avoid run-time control overhead and uses nonsymmetric L/U supernode partitioning and amalgamation strategies to maximize the use of BLAS-3 routines. The irregular task parallelism embedded in sparse LU is exploited using graph scheduling and efficient run-time support techniques which optimize communication, overlap computation with communication and balance processor loads. The experimental results on the Cray-T3D with a set of Harwell-Boeing nonsymmetric matrices are very encouraging and good scalability has been achieved. Even compared to a highly optimized sequential code, the parallel speedups are still impressive considering the current status of sparse LU research.