Compile-time techniques for efficient utilization of parallel memories

  • Authors:
  • Rajiv Gupta; Mary Lou Soffa

  • Affiliations:
  • Philips Laboratories, North American Philips Corporation, Briarcliff Manor, NY; Dept. of Computer Science, University of Pittsburgh, Pittsburgh, PA

  • Venue:
  • PPEALS '88 Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems
  • Year:
  • 1988

Abstract

The partitioning of shared memory into a number of memory modules is an approach to achieve high memory bandwidth for parallel processors. Memory access conflicts can occur when several processors simultaneously request data from the same memory module. Although work has been done to improve access performance for vectors, no work has been reported to improve the access performance of scalars. For systems in which the processors operate in a lock-step mode, a large percentage of memory access conflicts can be predicted at compile-time. These conflicts can be avoided by appropriate distribution of data among the memory modules at compile-time. A long instruction word machine is an example of a system in which the functional units operate in a lock-step mode performing operations on data fetched in parallel from multiple memory modules. In this paper, compile-time techniques for distribution of scalars to avoid memory access conflicts are presented. Furthermore, algorithms to schedule data transfers among memory modules to avoid conflicts that cannot be avoided by the distribution of values alone are developed. The techniques have been implemented as part of a compiler for a reconfigurable long instruction word architecture. Results of experiments are presented demonstrating that a very high percentage of memory access conflicts can be avoided by scheduling a very low number of data transfers.
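
The idea of distributing scalars at compile time can be illustrated with a small sketch. It is not the algorithm from the paper, whose details the abstract does not give. It assumes that each long instruction word is known at compile time as a set of scalar names fetched in the same cycle, that two scalars conflict when they appear in the same word, and that a simple greedy graph coloring stands in for the distribution step. The function names (`build_conflict_graph`, `assign_modules`, `count_conflicts`) are hypothetical.

```python
from itertools import combinations


def build_conflict_graph(instructions):
    """Link every pair of scalars fetched by the same long instruction word."""
    graph = {}
    for word in instructions:
        for scalar in word:
            graph.setdefault(scalar, set())
        for a, b in combinations(sorted(set(word)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def assign_modules(graph, num_modules):
    """Greedy coloring (an assumed stand-in, not the paper's method).

    Scalars are visited in order of decreasing conflict degree. Each one is
    placed in the lowest-numbered module not already used by a conflicting
    scalar. Scalars with no free module are returned separately as unplaced.
    """
    assignment = {}
    unplaced = []
    for scalar in sorted(graph, key=lambda s: (-len(graph[s]), s)):
        used = {assignment[n] for n in graph[scalar] if n in assignment}
        free = [m for m in range(num_modules) if m not in used]
        if free:
            assignment[scalar] = free[0]
        else:
            unplaced.append(scalar)
    return assignment, unplaced


def count_conflicts(instructions, assignment):
    """Count extra accesses that hit an already-used module within one word."""
    conflicts = 0
    for word in instructions:
        modules = [assignment[s] for s in set(word) if s in assignment]
        conflicts += len(modules) - len(set(modules))
    return conflicts


if __name__ == "__main__":
    # Hypothetical schedule: each tuple is one long instruction word.
    words = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
    placement, leftover = assign_modules(build_conflict_graph(words), 3)
    print(placement, leftover, count_conflicts(words, placement))
```

With three modules every scalar in this toy schedule can be placed without conflict. With only two modules one scalar is left unplaced, which corresponds to the kind of residual conflict that the abstract says is handled by scheduling data transfers among memory modules.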