Automated locality optimization based on the reuse distance of string operations

  • Authors:
  • Silvius Rus; Raksit Ashok; David Xinliang Li

  • Affiliations:
  • Google Inc., 1600 Amphitheatre Parkway, Mountain View, CA 94043; Google India Pvt. Ltd., No. 3, RMZ Infinity - Tower E, Old Madras Road, Bangalore, 560 016, India; Google Inc., 1600 Amphitheatre Parkway, Mountain View, CA 94043

  • Venue:
  • CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
  • Year:
  • 2011

Abstract

String operations such as memcpy, memset and memcmp account for a nontrivial share of Google datacenter resources. String operations hurt processor cache efficiency when the data they access is not reused shortly thereafter. Such cache pollution can be avoided by using nontemporal memory accesses to bypass the L2/L3 caches. Because reuse distance varies greatly across different static memcpy call contexts in the same program, an efficient solution needs to be call-context sensitive. We propose a novel solution to this problem that uses the page protection mechanism to measure reuse distance and the GCC feedback-directed optimization mechanism to generate nontemporal memory access instructions at the appropriate static code contexts. First, the compiler inserts instrumentation for calls to string operations. Then a runtime library measures reuse distance using the page protection mechanism during a representative profiling run. Finally, the compiler generates calls to specialized string operations that use nontemporal operations for the arguments with large reuse distance. We present a full implementation and initial results, including speedup on large datacenter applications.
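
To make the idea of a specialized string operation concrete, here is a minimal sketch in C of a copy routine that writes its destination with nontemporal stores. It is an illustration only, not the authors' implementation. The names `memcpy_nontemporal`, `memcpy_by_profile` and `REUSE_DISTANCE_THRESHOLD` are hypothetical, and the threshold value is an arbitrary placeholder. The sketch assumes an x86 target with SSE2 and falls back to plain `memcpy` elsewhere. It also picks the variant at run time, whereas the abstract describes the compiler making this choice per static call context from profile data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#endif

/* Hypothetical cutoff (in bytes touched before reuse); not from the paper. */
#define REUSE_DISTANCE_THRESHOLD ((size_t)1 << 20)

/* Copy n bytes, writing the destination with nontemporal (streaming)
 * stores so the written lines are not kept in the cache hierarchy.
 * Only the destination is treated as nontemporal in this sketch. */
static void *memcpy_nontemporal(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
#if defined(__SSE2__)
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    /* Streaming stores need a 16-byte aligned destination:
     * copy leading bytes one at a time until d is aligned. */
    while (n > 0 && ((uintptr_t)d & 15u) != 0) {
        *d++ = *s++;
        n--;
    }
    /* Main loop: unaligned load from the source, streaming store. */
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
        d += 16;
        s += 16;
        n -= 16;
    }
    /* Order the streaming stores before any later stores. */
    _mm_sfence();
    /* Copy the remaining tail (fewer than 16 bytes). */
    if (n > 0)
        memcpy(d, s, n);
    return dst;
#else
    /* No SSE2 available: behave like an ordinary copy. */
    return memcpy(dst, src, n);
#endif
}

/* Run-time stand-in for the compile-time decision: use the nontemporal
 * variant only when the profiled reuse distance is large. */
static void *memcpy_by_profile(void *dst, const void *src, size_t n,
                               size_t profiled_reuse_distance)
{
    if (profiled_reuse_distance > REUSE_DISTANCE_THRESHOLD)
        return memcpy_nontemporal(dst, src, n);
    return memcpy(dst, src, n);
}
```

A call site whose destination buffer is not read again soon, such as one filling a large output buffer, could call `memcpy_by_profile(dst, src, n, distance)` with the distance recorded for that site. A site whose data is consumed immediately would keep the ordinary cached copy.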