Compiler Support for Data Forwarding in Scalable Shared-Memory Multiprocessors

  • Authors:
  • David Koufaty;Josep Torrellas

  • Venue:
  • ICPP '99 Proceedings of the 1999 International Conference on Parallel Processing
  • Year:
  • 1999

Abstract

As the difference in speed between processor and memory system continues to increase, it is becoming crucial to develop and refine techniques that enhance the effectiveness of cache hierarchies. One promising technique in the context of scalable shared-memory multiprocessors is data forwarding. Forwarding hides the latency of communication-induced misses by having producer processors send data to the caches of potential consumer processors in advance. Forwarding can hide the latency effectively, has low instruction overhead, and uses few machine resources.

This paper presents a complete implementation of a data forwarding pass in an industrial-strength parallelizing compiler. Complete Fortran applications are analyzed for dependences and, based on the analysis, automatically annotated with forwarding directives. We propose a forwarding framework that includes four new instructions: write-forward, write-broadcast, write-update, and write-through. New micro-architectural support is proposed.

In our analysis, we assume that the assignment of loop iterations to processors is known. We perform simulations of multiprocessors with different cache, memory, machine sharing, and process migration parameters. We conclude that data forwarding delivers large speedups (six 32-processor applications ran an average of 40% faster), gets close to the upper bound in performance, and needs compiler support of only medium complexity.
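
The compile-time step the abstract describes (a known assignment of loop iterations to processors, followed by dependence analysis that decides where data should be forwarded) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the static block schedule, the hypothetical producer loop `a[i] = ...`, the hypothetical consumer loop that reads `a[i + d]`, and the helper names `owner` and `forwarding_targets`. The sketch only computes which remote processors would read each written element; it does not model the four proposed instructions or the paper's actual compiler pass.

```python
from collections import defaultdict


def owner(iteration, n_iters, n_procs):
    """Processor that runs `iteration` under a static block schedule (assumed)."""
    block = -(-n_iters // n_procs)  # ceiling division: iterations per processor
    return iteration // block


def forwarding_targets(n_iters, n_procs, read_offsets):
    """Map each array index written by the producer loop to the remote
    processors that read it in the consumer loop.

    Producer loop (hypothetical):  a[i] = ...          for i in range(n_iters)
    Consumer loop (hypothetical):  ... = a[i + d]      for each d in read_offsets
    Both loops are assumed to use the same block schedule.
    """
    targets = defaultdict(set)
    for i in range(n_iters):                 # consumer iteration
        consumer = owner(i, n_iters, n_procs)
        for d in read_offsets:
            j = i + d                        # element read by consumer iteration i
            if 0 <= j < n_iters:
                producer = owner(j, n_iters, n_procs)
                if producer != consumer:     # only remote reads need forwarding
                    targets[j].add(consumer)
    return dict(targets)


# Example: 8 iterations on 4 processors, consumer reads a[i-1], a[i], a[i+1].
plan = forwarding_targets(n_iters=8, n_procs=4, read_offsets=(-1, 0, 1))
for index in sorted(plan):
    producer = owner(index, 8, 4)
    print(f"a[{index}] written by P{producer} -> forward to {sorted(plan[index])}")
```

In this toy setup only the elements at block boundaries have remote readers, so a compiler following this idea would attach a forwarding annotation to those writes and leave the rest as ordinary stores.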