Processor Aware Anticipatory Prefetching in Loops

  • Authors:
  • Spiros Kalogeropulos;Mahadevan Rajagopalan;Vikram Rao;Yonghong Song;Partha Tirumalai

  • Affiliations:
  • Sun Microsystems, Inc.;Sun Microsystems, Inc.;Sun Microsystems, Inc.;Sun Microsystems, Inc.;Sun Microsystems, Inc.

  • Venue:
  • HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
  • Year:
  • 2004

Abstract

As microprocessor speeds increase, a large fraction of execution time is often lost to cache miss penalties. This loss can be particularly severe in processors such as the UltraSPARC-IIICu, which execute in order and block on cache misses. Such processors rely heavily on the compiler to reduce stalls and achieve high performance. This paper describes a compiler technique for software prefetching that is aware of the specific prefetch behaviors of the target processor. The implementation targets loops containing control flow and strided or irregular memory access patterns. A two-phase locality analysis, capable of handling complex subscript expressions, is used to improve the identification of prefetch candidates. Prefetch instructions are scheduled with careful consideration of the prefetch behaviors of the target system. Compared to a previous implementation, our technique produced performance improvements of 9% on the geometric mean, and up to 44% on individual tests, in Sun's first UltraSPARC-IIICu based SPEC CPU2000 submission [5], and it has been used in all later submissions to date.
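
To make the idea of software prefetching in loops concrete, the following is a minimal hand-written sketch in C, not the compiler transformation described in the paper. It assumes a GCC- or Clang-compatible compiler that provides `__builtin_prefetch`. The function names `strided_sum` and `indirect_sum` and the constant `PREFETCH_DISTANCE` are hypothetical, chosen only for illustration; a processor-aware compiler would presumably derive the distance from the target's miss latency and prefetch behavior rather than use a fixed value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance, in loop iterations (an assumption for
   illustration; a real compiler would tune this per target processor). */
#define PREFETCH_DISTANCE 16

/* Strided access: sum a[0], a[stride], a[2*stride], ... below index n,
   requesting each element PREFETCH_DISTANCE iterations before it is used. */
double strided_sum(const double *a, size_t n, size_t stride)
{
    assert(stride > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i += stride) {
        size_t ahead = i + PREFETCH_DISTANCE * stride;
        if (ahead < n) {
            /* Arguments: address, 0 = prefetch for read, 3 = keep in cache. */
            __builtin_prefetch(&a[ahead], 0, 3);
        }
        sum += a[i];
    }
    return sum;
}

/* Irregular (indirect) access: sum a[idx[i]] for i in [0, n).
   The index array is read early so the data it points to can be prefetched.
   Every idx[i] is assumed to be a valid index into a. */
double indirect_sum(const double *a, const size_t *idx, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            __builtin_prefetch(&a[idx[i + PREFETCH_DISTANCE]], 0, 3);
        }
        sum += a[idx[i]];
    }
    return sum;
}
```

Both functions return the same result as their plain, prefetch-free loops; the prefetches are only hints and do not change program semantics. The bounds checks keep the prefetch address inside the array, at the cost of a branch per iteration that a production compiler might avoid by peeling the final iterations.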