Scalable Implementation of Efficient Locality Approximation

  • Authors:
  • Xipeng Shen;Jonathan Shaw

  • Affiliations:
  • Computer Science Department, The College of William and Mary, Williamsburg;Shaw Technologies, Inc., Tualatin, OR,

  • Venue:
  • Languages and Compilers for Parallel Computing
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

As memory hierarchy becomes deeper and shared by more processors, locality increasingly determines system performance. As a rigorous and precise locality model, reuse distance has been used in program optimizations, performance prediction, memory disambiguation, and locality phase prediction. However, the high cost of measurement has been severely impeding its uses in scenarios requiring high efficiency, such as product compilers, performance debugging, run-time optimizations. We recently discovered the statistical connection between time and reuse distance, which led to an efficient way to approximate reuse distance using time. However, not exposed are some algorithmic and implementation techniques that are vital for the efficiency and scalability of the approximation model. This paper presents these techniques. It describes an algorithm that approximates reuse distance on arbitrary scales; it explains a portable scheme that employs memory controller to accelerate the measure of time distance; it uncovers the algorithm and proof of a trace generator that can facilitate various locality studies.