Memory performance at reduced CPU clock speeds: an analysis of current x86_64 processors

  • Authors:
  • Robert Schöne; Daniel Hackenberg; Daniel Molka

  • Affiliations:
  • Center for Information Services and High Performance Computing, Technische Universität Dresden, Dresden, Germany (all three authors)

  • Venue:
  • HotPower'12: Proceedings of the 2012 USENIX conference on Power-Aware Computing and Systems
  • Year:
  • 2012

Abstract

Reducing CPU frequency and voltage is a well-known approach to reducing the energy consumption of memory-bound applications. It rests on the assumption that main memory performance sees little or no degradation at reduced processor clock speeds, while power consumption decreases significantly. We study this effect in detail on the latest generation of x86-64 compute nodes. Our results show that memory and last-level cache bandwidths at reduced clock speeds depend strongly on the processor microarchitecture. For example, while an Intel Westmere-EP processor achieves 95% of the peak main memory bandwidth at the lowest processor frequency, the bandwidth decreases to only 60% on the latest Sandy Bridge-EP platform. Increased efficiency of memory-bound applications may also be achieved with concurrency throttling, i.e., reducing the number of active cores per socket. We therefore complete our study with a detailed analysis of memory bandwidth scaling at different concurrency levels on our test systems. Our results, both qualitative trends and absolute bandwidth numbers, are valuable for scientists in the areas of computer architecture, performance and power analysis and modeling, as well as for application developers seeking to optimize their codes on current x86-64 systems.