Prefetch injection based on hardware monitoring and object metadata

  • Authors:
  • Ali-Reza Adl-Tabatabai;Richard L. Hudson;Mauricio J. Serrano;Sreenivas Subramoney

  • Affiliations:
  • Intel Corporation, Santa Clara, CA;Intel Corporation, Santa Clara, CA;Intel Corporation, Santa Clara, CA;Intel Corporation, Santa Clara, CA

  • Venue:
  • Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

Cache miss stalls hurt performance because of the large gap between memory and processor speeds - for example, the popular server benchmark SPEC JBB2000 spends 45% of its cycles stalled waiting for memory requests on the Itanium® 2 processor. Traversing linked data structures causes a large portion of these stalls. Prefetching for linked data structures remains a major challenge because serial data dependencies between elements in a linked data structure preclude the timely materialization of prefetch addresses. This paper presents Mississippi Delta (MS Delta), a novel technique for prefetching linked data structures that closely integrates the hardware performance monitor (HPM), the garbage collector's global view of heap and object layout, the type-level metadata inherent in type-safe programs, and JIT compiler analysis. The garbage collector uses the HPM's data cache miss information to identify cache miss intensive traversal paths through linked data structures, and then discovers regular distances (deltas) between these linked objects. JIT compiler analysis injects prefetch instructions using deltas to materialize prefetch addresses.We have implemented MS Delta in a fully dynamic profile-guided optimization system: the StarJIT dynamic compiler [1] and the ORP Java virtual machine [9]. We demonstrate a 28-29% reduction in stall cycles attributable to the high-latency cache misses targeted by MS Delta and a speedup of 11-14% on the cache miss intensive SPEC JBB2000 benchmark.