Evaluating the locality benefits of active messages

  • Authors: Ellen Spertus; William J. Dally

  • Affiliations: Microsoft Research, 1 Microsoft Way, Redmond, WA and Laboratory for Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts; Laboratory for Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts

  • Venue: PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
  • Year: 1995

Abstract

A major challenge in fine-grained computing is achieving locality without excessive scheduling overhead. We built two J-Machine implementations of a fine-grained programming model, the Berkeley Threaded Abstract Machine. One implementation takes an Active Messages approach, maintaining a scheduling hierarchy in software in order to improve data cache performance. Another approach relies on the J-Machine's message queues and fast task switch, lowering the control costs at the expense of data locality. Our analysis measures the costs and benefits of each approach, for a variety of programs and cache configurations. The Active Messages implementation is strongest when miss penalties are high and for the finest-grained programs. The hardware-buffered implementation is strongest in direct-mapped caches, where it achieves substantially better instruction cache performance.
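The locality side of this trade-off can be illustrated with a toy model. The sketch below is not the paper's TAM or J-Machine implementation: it assumes, for illustration only, that each message targets one activation frame, that the software scheduling hierarchy groups pending work by frame, and that the data cache is a tiny fully associative LRU cache holding whole frames. All names (`count_misses`, `fifo_order`, `frame_grouped_order`) are hypothetical, and the extra control cost of maintaining the hierarchy is deliberately left out.

```python
from collections import OrderedDict


def count_misses(frame_sequence, cache_frames):
    """Count misses in a fully associative LRU cache that holds `cache_frames` frames."""
    cache = OrderedDict()  # keys are frame ids, ordered from least to most recently used
    misses = 0
    for frame in frame_sequence:
        if frame in cache:
            cache.move_to_end(frame)       # hit: mark as most recently used
        else:
            misses += 1
            cache[frame] = True
            if len(cache) > cache_frames:
                cache.popitem(last=False)  # evict the least recently used frame
    return misses


def fifo_order(arrivals):
    """Hardware-buffered style (assumed): run work in message arrival order."""
    return list(arrivals)


def frame_grouped_order(arrivals):
    """Software-hierarchy style (assumed): run all pending work for one frame
    before moving to the next, visiting frames in first-arrival order."""
    groups = OrderedDict()
    for frame in arrivals:
        groups.setdefault(frame, []).append(frame)
    return [frame for group in groups.values() for frame in group]


# Nine messages that interleave three frames; the cache holds only two frames.
arrivals = [0, 1, 2, 0, 1, 2, 0, 1, 2]
fifo_misses = count_misses(fifo_order(arrivals), cache_frames=2)
grouped_misses = count_misses(frame_grouped_order(arrivals), cache_frames=2)
print(fifo_misses, grouped_misses)
```

In this contrived workload the arrival-order schedule misses on every access, while the frame-grouped schedule misses once per frame. A real comparison would also have to charge the grouped schedule for its software queueing, which is the cost the hardware-buffered implementation avoids.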