Systolic block-Jacobi SVD algorithm for processor meshes

  • Authors:
  • G. Okša; M. Vajteršic

  • Affiliations:
  • Institute for Mathematics, Slovak Academy of Sciences, Bratislava, Slovak Republic;Institute for Mathematics, Slovak Academy of Sciences, Bratislava, Slovak Republic

  • Venue:
  • Highly parallel computations
  • Year:
  • 2001

Abstract

We design a systolic version of the block-Jacobi algorithm for the singular value decomposition (SVD) of a matrix A ∈ R^{n×n}. The algorithm uses the class CO of parallel orderings on a two-dimensional toroidal mesh with p processors. The mathematical background is based on the QR decomposition of local data matrices and on Kogbetliantz's algorithm for local SVDs in the diagonal processors. Subsequent updates of the local matrices are required in the diagonal as well as the nondiagonal processors. We show that all updates can be realized by orthogonal (modified) Givens rotations. These rotations can be efficiently pipelined in parallel in the horizontal and vertical rings of √p processors through the toroidal mesh. Our solution requires O(3n^2) systolic processing elements, O(3n^2) memory registers, O(n^2) additional delay elements and O[(4(2√p + r + 1) - 2r+4/√p)sn] time steps, where s is the number of global sweeps in the block-Jacobi algorithm and r is the number of local sweeps in Kogbetliantz's algorithm.
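
As a rough illustration of the building block the abstract refers to, the sketch below applies a single standard Givens rotation to two rows of a small matrix so that one entry becomes zero. It is an assumption-laden toy, not the authors' method: it uses the plain (unmodified) rotation rather than the modified Givens rotations of the paper, it runs sequentially rather than on a systolic mesh, and the helper names givens and apply_givens_rows are hypothetical.

```python
import math


def givens(a, b):
    """Return (c, s) so that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        # Nothing to annihilate; use the identity rotation.
        return 1.0, 0.0
    return a / r, b / r


def apply_givens_rows(M, i, k, c, s):
    """Rotate rows i and k of M (a list of lists) in place."""
    for j in range(len(M[i])):
        x, y = M[i][j], M[k][j]
        M[i][j] = c * x + s * y
        M[k][j] = -s * x + c * y


# Illustrative 2x2 matrix (hypothetical data, not taken from the paper).
A = [[3.0, 1.0], [4.0, 2.0]]
norm_before = sum(v * v for row in A for v in row)

# Choose the rotation that zeroes A[1][0], then apply it to both rows.
c, s = givens(A[0][0], A[1][0])
apply_givens_rows(A, 0, 1, c, s)

norm_after = sum(v * v for row in A for v in row)
```

Because the rotation is orthogonal, the Frobenius norm of the matrix is unchanged, which is the property that lets such rotations be chained through a row or column of processors without altering the singular values.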