Automatic fence insertion for shared memory multiprocessing

  • Authors: Xing Fang; Jaejin Lee; Samuel P. Midkiff

  • Affiliations: Purdue University, West Lafayette, IN; Seoul National University, Seoul, Korea; Purdue University, West Lafayette, IN

  • Venue: ICS '03 Proceedings of the 17th annual international conference on Supercomputing
  • Year: 2003

Abstract

In general, the hardware memory consistency model of a multiprocessor system is not identical to the memory model at the programming language level. Consequently, the programming language memory model must be mapped onto the hardware memory model. The compiler can accomplish this mapping by inserting memory fence instructions where needed. We have developed and implemented several fence insertion and optimization algorithms in our Pensieve compiler project. We present the fence insertion optimization techniques used in this system to guarantee sequential consistency at the language level, and compare them using performance data. Our techniques target two relaxed hardware memory consistency models, those provided by SMPs based on the IBM Power 3 and the Intel Pentium 4. Our fence insertion optimization shows up to 17.2% and 32.7% performance improvement on average on the IBM PowerPC and Intel Pentium 4 (Xeon) multiprocessors, respectively.
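
To make the idea of fence insertion and its optimization concrete, the following is a minimal sketch and not the Pensieve algorithms themselves, which the abstract does not describe. It assumes a hypothetical TSO-like hardware model in which the only reordering the hardware may perform is letting a load pass an earlier store; under that assumption, a fence is needed only before a load that follows an unfenced store. The names `Op`, `insert_fences_naive`, `insert_fences_store_load`, and `count_fences` are illustrative, and the sketch ignores aliasing, control flow, and any analysis of which accesses actually conflict across threads.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Op:
    """One operation in a thread's straight-line code (hypothetical IR)."""
    kind: str      # "load", "store", or "fence"
    var: str = ""  # shared variable name; empty for fences


FENCE = Op("fence")


def insert_fences_naive(ops: List[Op]) -> List[Op]:
    """Conservative baseline: a fence between every pair of consecutive operations."""
    out: List[Op] = []
    for i, op in enumerate(ops):
        if i > 0:
            out.append(FENCE)
        out.append(op)
    return out


def insert_fences_store_load(ops: List[Op]) -> List[Op]:
    """Assumed TSO-like model: fence only before a load that follows an unfenced store."""
    out: List[Op] = []
    pending_store = False  # True if a store has been emitted since the last fence
    for op in ops:
        if op.kind == "fence":
            pending_store = False
        elif op.kind == "load" and pending_store:
            out.append(FENCE)
            pending_store = False
        elif op.kind == "store":
            pending_store = True
        out.append(op)
    return out


def count_fences(ops: List[Op]) -> int:
    """Number of fence operations in a sequence."""
    return sum(1 for op in ops if op.kind == "fence")


# One thread of a flag-based handshake, written as a straight-line sequence.
thread = [
    Op("store", "flag0"),
    Op("load", "flag1"),
    Op("load", "data"),
    Op("store", "data"),
]
naive = insert_fences_naive(thread)
optimized = insert_fences_store_load(thread)
```

For the four-operation thread above, the naive pass emits three fences, while the store-to-load pass keeps only the one separating `store flag0` from `load flag1`. A weaker hardware model would need more orderings enforced, so the rule in `insert_fences_store_load` would not be sufficient there.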