Parallel, multigrain iterative solvers for hiding network latencies on MPPs and networks of clusters

  • Authors:
  • James R. McCombs; Andreas Stathopoulos

  • Affiliations:
  • Department of Computer Science, College of William and Mary, Williamsburg, VA; Department of Computer Science, College of William and Mary, Williamsburg, VA

  • Venue:
  • Parallel Computing - Parallel matrix algorithms and applications (PMAA '02)
  • Year:
  • 2003

Abstract

Parallel iterative solvers are often the only means of solving large linear systems and eigenproblems. However, these solvers are usually implemented in a fine-grain manner and can incur significant performance penalties due to synchronization overheads on large MPPs. This problem is exacerbated in clusters of workstations (COWs) and SMPs that are interconnected via a hierarchy of networks. In this paper, we describe a novel scheme for hiding the synchronization overheads, and thus improving scalability, of block iterative solvers that employ a correction equation through an inner iterative method.

Block methods are not only robust in the presence of eigenvalue multiplicities and multiple right-hand sides, but provide better latency tolerance by performing more floating-point operations between synchronizations. We take a different approach to inducing latency tolerance by increasing the granularity at which the correction equation is solved for each block vector. This is accomplished by splitting the processors into smaller subgroups which are then used to solve the correction for each block vector concurrently. The rest of the algorithm is still performed in fine grain. We call this combination of fine and coarse-grain parallelism multigrain parallelism.

We implemented a multigrain, block Jacobi-Davidson algorithm for computing the extreme eigenvalues of a symmetric matrix. We obtained improvements of 45-50% over both the block and non-block implementations of the fine-grain method when testing on an IBM SP and on a collection of clusters of Sun workstations.
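
The processor-splitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_into_subgroups` and `color_of` are hypothetical, and the choice of contiguous, near-equal subgroups is an assumption made here for illustration (the abstract does not say how ranks are assigned). The sketch only shows how P processor ranks might be partitioned so that each block vector's correction equation gets its own subgroup.

```python
def split_into_subgroups(num_procs, block_size):
    """Partition ranks 0..num_procs-1 into block_size contiguous subgroups.

    Assumption (not from the paper): subgroups are contiguous and differ
    in size by at most one rank. Each subgroup would solve the correction
    equation for one block vector.
    """
    if block_size < 1 or num_procs < block_size:
        raise ValueError("need at least one processor per block vector")
    base, extra = divmod(num_procs, block_size)
    groups = []
    start = 0
    for g in range(block_size):
        # The first `extra` subgroups absorb the leftover ranks.
        size = base + (1 if g < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def color_of(rank, groups):
    """Return the index of the subgroup containing `rank`.

    In an MPI code this index could serve as the `color` argument of a
    communicator split (MPI_Comm_split), giving each subgroup its own
    communicator for the coarse-grain phase.
    """
    for color, members in enumerate(groups):
        if rank in members:
            return color
    raise ValueError("rank not in any subgroup")


groups = split_into_subgroups(8, 4)
```

With 8 processors and a block size of 4, each subgroup holds two ranks. Synchronization inside the inner solver then involves only the members of one subgroup, while the outer iteration would still use all processors in fine grain.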