An Estimation of Complexity and Computational Costs for Vertical Block-Cyclic Distributed Parallel LU Factorization

  • Authors: Toshiyuki Imamura
  • Affiliations: Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, 2-2-54 Nakameguro, Meguro-ku, Tokyo 153, Japan (imamura@koma.jaeri.go.jp)
  • Venue: The Journal of Supercomputing
  • Year: 2000

Abstract

The Vertical Block-cyclic Distributed Parallel LU Factorization Method (VBPLU) runs efficiently on distributed-memory parallel computers. VBPLU is based on two techniques: the block algorithm and the aggregation of communications. Because startup time dominates data communication and aggregation reduces the number of communications issued, overall performance is much improved. Furthermore, the method uses long vectors, so it is also advantageous on vector processors. In this paper, we construct a model of VBPLU using a simplified LogGP model with analytical formulae, and accurately estimate the computational cost, taking into account the load distribution caused by data layout and process mapping. We also obtain some insights into optimizing the block algorithm. Our estimates are verified through numerical experiments on three different distributed-memory parallel computers.
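The two ideas named in the abstract, block-cyclic data layout and aggregation of communications, can be illustrated with a minimal Python sketch. It is not the paper's model. It assumes that "vertical" means column blocks dealt round-robin to processes, and it replaces the simplified LogGP model with a plain linear cost (a fixed startup time plus a per-byte time). All function and parameter names here are hypothetical.

```python
def owner_of_column(j, block_size, num_procs):
    """Process that owns column j when column blocks are dealt round-robin.

    Assumption: a column block-cyclic layout; the abstract does not spell
    out the exact mapping used by VBPLU.
    """
    return (j // block_size) % num_procs


def columns_per_process(n, block_size, num_procs):
    """Count how many of the n columns each process owns (load distribution)."""
    counts = [0] * num_procs
    for j in range(n):
        counts[owner_of_column(j, block_size, num_procs)] += 1
    return counts


def message_time(num_bytes, startup, per_byte):
    """Linear cost of one message: startup plus size-proportional transfer.

    Assumption: a stand-in for a simplified LogGP-style model, not the
    formulae derived in the paper.
    """
    return startup + num_bytes * per_byte


def separate_vs_aggregated(num_msgs, bytes_each, startup, per_byte):
    """Cost of sending num_msgs messages one by one versus as one message."""
    separate = num_msgs * message_time(bytes_each, startup, per_byte)
    aggregated = message_time(num_msgs * bytes_each, startup, per_byte)
    return separate, aggregated


if __name__ == "__main__":
    # Uneven load: 10 columns, blocks of 2, 3 processes.
    print(columns_per_process(10, 2, 3))
    # Startup-dominated regime: aggregation pays the startup cost only once.
    print(separate_vs_aggregated(8, 100, 50, 1))
```

Under this linear cost, aggregating k messages saves (k - 1) startup times, which is why it matters most when startup dominates. The column counts show how the block size and process count together determine load imbalance.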