On iterative QR pre-processing in the parallel block-Jacobi SVD algorithm

  • Authors:
  • Martin Bečka;Gabriel Okša;Marián Vajteršic;Laura Grigori

  • Affiliations:
  • Institute of Mathematics, Dept. of Informatics, Slovak Academy of Sciences, Bratislava, Slovak Republic;Institute of Mathematics, Dept. of Informatics, Slovak Academy of Sciences, Bratislava, Slovak Republic;Dept. of Computer Sciences, University of Salzburg, Salzburg, Austria;INRIA, University Paris Sud-11, Orsay, France

  • Venue:
  • Parallel Computing
  • Year:
  • 2010

Abstract

An efficient version of the parallel two-sided block-Jacobi algorithm for the singular value decomposition of an m × n matrix A includes a pre-processing step, which consists of the QR factorization of A with column pivoting, optionally followed by the LQ factorization of the R-factor. The iterative two-sided block-Jacobi algorithm is then applied in parallel to the R-factor (or L-factor). For the efficient computation of the parallel QR (or LQ) factorization with (or without) column pivoting as implemented in ScaLAPACK, a block cyclic distribution of the matrix on an r × c process grid, with p = r × c, r, c ≥ 1, and block size n_b × n_b, is required so that all processors remain busy during the whole parallel QR (or LQ) factorization. Optimal values of the parameters r, c and n_b are estimated experimentally using matrices of order n = 4000 and 8000, and the number of processors p = 8 and 16, respectively. It turns out that the optimal values are about n_b = 100 and r=
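
The pre-processing step and the block cyclic layout described in the abstract can be illustrated with a small serial sketch. This is not the authors' parallel implementation: it assumes NumPy and SciPy, uses SciPy's dense SVD as a stand-in for the two-sided block-Jacobi iteration, and the function names (`preprocess`, `svd_via_preprocessing`, `block_cyclic_owner`) are hypothetical. The owner mapping assumes the usual 2D block cyclic convention with the first block placed on process (0, 0).

```python
import numpy as np
from scipy.linalg import qr, svd


def preprocess(A, use_lq=True):
    """QR with column pivoting, optionally followed by LQ of the R-factor."""
    # A[:, piv] = Q1 @ R, with R upper triangular (economy size).
    Q1, R, piv = qr(A, mode="economic", pivoting=True)
    if not use_lq:
        return Q1, R, None, piv
    # LQ of R via QR of R^T: R^T = Q2 @ R2, hence R = R2^T @ Q2^T = L @ Q2^T.
    Q2, R2 = qr(R.T, mode="economic")
    return Q1, R2.T, Q2, piv


def svd_via_preprocessing(A, use_lq=True):
    """SVD of A computed from the SVD of its R-factor (or L-factor)."""
    Q1, T, Q2, piv = preprocess(A, use_lq)
    # Stand-in for the block-Jacobi iteration applied to the triangular factor.
    U_t, s, Vh_t = svd(T, full_matrices=False)
    U = Q1 @ U_t
    # Right factor for the column-permuted matrix A[:, piv].
    Vh_perm = Vh_t @ Q2.T if Q2 is not None else Vh_t
    # Undo the column pivoting so that A = U @ diag(s) @ Vh.
    Vh = np.empty_like(Vh_perm)
    Vh[:, piv] = Vh_perm
    return U, s, Vh


def block_cyclic_owner(i, j, nb, r, c):
    """Process-grid coordinates owning global entry (i, j) for nb x nb blocks
    distributed cyclically over an r x c grid (first block on process (0, 0))."""
    return (i // nb) % r, (j // nb) % c
```

For example, with n_b = 100 on a 2 × 4 grid, `block_cyclic_owner(250, 430, 100, 2, 4)` maps block row 2 and block column 4 back to process (0, 0), which shows how the blocks wrap around the grid so that each process holds pieces from all parts of the matrix.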