Reducing energy usage with memory and computation-aware dynamic frequency scaling

  • Authors:
  • Michael A. Laurenzano; Mitesh Meswani; Laura Carrington; Allan Snavely; Mustafa M. Tikir; Stephen Poole

  • Affiliations:
  • San Diego Supercomputer Center, La Jolla, CA; San Diego Supercomputer Center, La Jolla, CA; San Diego Supercomputer Center, La Jolla, CA; San Diego Supercomputer Center, La Jolla, CA; Google, Inc, Mountain View, CA; Oak Ridge National Laboratory, Oak Ridge, TN

  • Venue:
  • Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
  • Year:
  • 2011

Abstract

Over the life of a modern supercomputer, the energy cost of running the system can exceed the cost of the original hardware purchase. This has driven the community to try to understand and minimize energy costs wherever possible. To that end, we present an automated, fine-grained approach to selecting per-loop processor clock frequencies. The clock frequency selection criteria are established through a combination of lightweight static analysis and runtime tracing that automatically acquires application signatures, which are characterizations of the patterns of execution of each loop in an application. Each loop's characterization is matched with one of a series of benchmark loops, which have been run on the target system and probe it in various ways. These benchmarks form a covering set: a machine characterization of the expected power consumption and performance traits of the machine over the space of execution patterns and clock frequencies. For each application loop, the frequency chosen is the one that yields the best power-delay product for the benchmark that most closely resembles that loop. The set of tools that implements this scheme is fully automated, built on top of freely available open source software, and uses an inexpensive power measurement apparatus. We use these tools to show measured system-wide energy savings of up to 7.6% on an 8-core Intel Xeon E5530 and 10.6% on a 32-core AMD Opteron 8380 (a Sun X4600 Node) across a range of workloads.
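
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling: the two-element signature vector, the Euclidean distance used for matching, the `Benchmark` class, and all numbers below are hypothetical, chosen only to show the idea of picking, for each loop, the frequency with the lowest power-delay product on the most similar benchmark.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Tuple


@dataclass
class Benchmark:
    # Hypothetical signature, e.g. (cache miss ratio, flops per memory op).
    signature: Tuple[float, ...]
    # Clock frequency in GHz -> (average power in W, runtime in s),
    # as measured on the target machine. Values here are made up.
    measurements: Dict[float, Tuple[float, float]]


def best_frequency(bench: Benchmark) -> float:
    """Return the frequency with the lowest power-delay product."""
    return min(
        bench.measurements,
        key=lambda f: bench.measurements[f][0] * bench.measurements[f][1],
    )


def select_loop_frequency(
    loop_signature: Tuple[float, ...], benchmarks: List[Benchmark]
) -> float:
    """Match a loop to its nearest benchmark and reuse that benchmark's
    best frequency. Euclidean distance is an assumption of this sketch."""
    nearest = min(benchmarks, key=lambda b: dist(loop_signature, b.signature))
    return best_frequency(nearest)


# Illustrative covering set with two benchmark loops.
memory_bound = Benchmark(
    signature=(0.9, 0.1),
    measurements={1.6: (180.0, 10.5), 2.4: (230.0, 10.0)},
)
compute_bound = Benchmark(
    signature=(0.05, 4.0),
    measurements={1.6: (170.0, 15.0), 2.4: (220.0, 10.0)},
)
covering_set = [memory_bound, compute_bound]

# A loop that resembles the memory-bound benchmark gets the lower clock;
# one that resembles the compute-bound benchmark keeps the higher clock.
slow_loop_freq = select_loop_frequency((0.8, 0.3), covering_set)
fast_loop_freq = select_loop_frequency((0.1, 3.5), covering_set)
```

In this toy data the memory-bound benchmark barely slows down at the lower clock, so its power-delay product favors 1.6 GHz, while the compute-bound benchmark loses enough runtime that 2.4 GHz wins.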