Parallel processor balance through loop spreading

  • Authors:
  • Y. Wu; T. Lewis

  • Affiliations:
  • Sequent Computer Systems, Inc., Beaverton, OR; Oregon State University, Corvallis, OR

  • Venue:
  • Proceedings of the 1989 ACM/IEEE conference on Supercomputing
  • Year:
  • 1989

Abstract

When the number of processors P is less than the number of tasks N in a parallel loop, the loop has to be executed in ⌈N/P⌉ rounds, and when N is not a multiple of P the last round executes only N mod P tasks. In many cases all but a few processors are idle in the last round, which causes a significant drop in performance. This drop becomes more severe as the number of processors increases. Loop spreading is a technique for restructuring parallel loops so that parallel tasks are balanced across multiple processors. A spread loop runs at least as fast as the non-spread loop even when N mod P = 0, and shows no performance drop when N changes. We show how the method keeps the performance of matrix multiplication and a simplex algorithm from decreasing as the input size changes.
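
The imbalance described in the abstract can be made concrete with a little arithmetic. The sketch below is not the authors' loop-spreading method; it only models the non-spread loop, under the simplifying assumptions that every task costs one time unit and that scheduling overhead is zero. The function name `round_profile` and its return layout are hypothetical, chosen for illustration.

```python
def round_profile(n_tasks: int, n_procs: int) -> tuple[int, int, int, float]:
    """Model a non-spread parallel loop of n_tasks unit-cost tasks on n_procs processors.

    Assumption (not from the paper): all tasks take equal time and there is
    no scheduling overhead. Returns (rounds, tasks in the last round,
    idle processors in the last round, overall processor efficiency).
    """
    if n_tasks <= 0 or n_procs <= 0:
        raise ValueError("n_tasks and n_procs must be positive")

    # ceil(N / P) computed with integer arithmetic only
    rounds = -(-n_tasks // n_procs)

    # The last round runs N mod P tasks, or a full P tasks when P divides N
    last_round_tasks = n_tasks - (rounds - 1) * n_procs
    idle_in_last_round = n_procs - last_round_tasks

    # Useful work divided by processor-time consumed over all rounds
    efficiency = n_tasks / (rounds * n_procs)
    return rounds, last_round_tasks, idle_in_last_round, efficiency


if __name__ == "__main__":
    # One task more than the processor count forces a nearly empty second round
    for n, p in [(64, 64), (65, 64), (100, 8)]:
        print(n, p, round_profile(n, p))
```

Under these assumptions, going from N = 64 to N = 65 on P = 64 processors doubles the number of rounds while adding a single task, so efficiency falls from 1.0 to roughly 0.51. This is the kind of drop the abstract says a spread loop avoids.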