On variable blocking factor in a parallel dynamic block: Jacobi SVD algorithm

  • Authors:
  • Martin Bečka;Gabriel Okša

  • Affiliations:
  • Department of Informatics, Institute of Mathematics, Slovak Academy of Sciences, Dúbravská cesta 9, 841 04 Bratislava, Slovak Republic;Department of Informatics, Institute of Mathematics, Slovak Academy of Sciences, Dúbravská cesta 9, 841 04 Bratislava, Slovak Republic

  • Venue:
  • Parallel Computing - Parallel matrix algorithms and applications (PMAA '02)
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

The parallel two-sided block-Jacobi singular value decomposition (SVD) algorithm with dynamic ordering, originally proposed in [Parallel Comput. 28 (2002) 243-262], has been extended with respect to the blocking factor ℓ. Unlike the unique blocking factor ℓ = 2p in the original algorithm running on p processors, the current blocking factor is a variable parameter that covers the values in two different regions--namely, ℓ = p/k and ℓ = 2kp for some integer k. Two new parallel two-sided block-Jacobi SVD algorithms with dynamic ordering are described in detail. They arise in those two regions and differ in the logical data arrangement and communication complexity of the reordering step. For the case of ℓ = 2kp, it is proved that a designed point-to-point communication algorithm is optimal with respect to the amount of communication required per processor as well as to the amount of overall communication. Using the message passing programming model for distributed memory machines, new parallel block-Jacobi SVD algorithms were implemented on an SGI-Cray Origin 2000 parallel computer. Numerical experiments were performed on p = 12 and 24 processors using a set of six matrices of order 4000 and blocking factors ℓ, 2 ≤ ℓ ≤ 192. To achieve the minimal total parallel execution time, the use of a blocking factor ℓ ∈ {2,p, 2p} can be recommended for matrices with distinct singular values. However, for matrices with a multiple minimal singular value, the total parallel execution time may monotonically increase with ℓ. In this case, the recommended Jacobi method with ℓ = 2 is just the ScaLAPACK routine with some additional matrix multiplications, and it computes the SVD in one parallel iteration step.