Improving Processor and Cache Locality in Fine-Grain Parallel Computations using Object-Affinity Scheduling and Continuation Passing

  • Authors:
  • Robert J. Fowler; Leonidas I. Kontothanassis

  • Venue:
  • Improving Processor and Cache Locality in Fine-Grain Parallel Computations using Object-Affinity Scheduling and Continuation Passing
  • Year:
  • 1992

Abstract

On recent high-performance multiprocessors, there is a potential conflict between the goals of achieving the full performance potential of the hardware and providing a parallel programming environment that makes effective use of programmer effort. On one hand, an explicit coarse-grain programming style may appear to be necessary, both to achieve good cache performance and to limit the amount of overhead due to context switching and synchronization. On the other hand, it may be more expedient to use more natural and finer-grain programming styles based on abstractions such as task heaps, light-weight threads, parallel loops, or object-oriented parallelism. Unfortunately, using these styles can cause a loss of performance due to poor locality and high overhead. We claim that the locality issue in fine-grain parallel programs can be addressed effectively by using object-affinity scheduling and that the overhead can be reduced substantially by representing tasks as templates that are managed using continuation-passing style mechanisms. We present supporting evidence for these claims in the form of experimental measurements of programs running on Mercury, an object-oriented system implemented on an SGI 4D/480 multiprocessor.
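The two mechanisms named in the abstract can be illustrated with a toy, single-threaded model. This is only a sketch under stated assumptions and not Mercury's implementation: the `AffinityScheduler` class, its `spawn` and `run` methods, and the round-robin choice of a home processor are all hypothetical. It shows tasks on the same object always being queued on the same simulated processor (object affinity), and tasks being stored as small templates whose result is handed to a continuation rather than as threads with their own stacks.

```python
from collections import deque


class AffinityScheduler:
    """Toy model: one run queue per simulated processor (hypothetical design)."""

    def __init__(self, num_procs):
        self.queues = [deque() for _ in range(num_procs)]
        self.home = {}        # object id -> processor the object is assigned to
        self.next_proc = 0    # round-robin placement for objects seen for the first time
        self.log = []         # (processor, object id) in execution order

    def spawn(self, obj_id, fn, arg, cont):
        # A task is a small template (obj_id, fn, arg, cont), not a thread with a stack.
        if obj_id not in self.home:
            self.home[obj_id] = self.next_proc
            self.next_proc = (self.next_proc + 1) % len(self.queues)
        # Object affinity: always enqueue on the object's home processor.
        self.queues[self.home[obj_id]].append((obj_id, fn, arg, cont))

    def run(self):
        # Visit the processors in turn until every queue is empty.
        while any(self.queues):
            for proc, queue in enumerate(self.queues):
                if queue:
                    obj_id, fn, arg, cont = queue.popleft()
                    self.log.append((proc, obj_id))
                    cont(fn(arg))  # pass the result to the continuation


results = []
sched = AffinityScheduler(num_procs=2)
for obj_id, value in [("a", 1), ("b", 2), ("a", 3), ("b", 4)]:
    sched.spawn(obj_id, lambda x: x * x, value, results.append)
sched.run()
```

In this model every task on object "a" runs on simulated processor 0 and every task on "b" on processor 1, which is the property a real scheduler would exploit to keep an object's data in one processor's cache. A real system would also need load balancing and true parallel execution, both omitted here.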