The Energy Impact of Aggressive Loop Fusion

  • Authors:
  • YongKang Zhu;Grigorios Magklis;Michael L. Scott;Chen Ding;David H. Albonesi

  • Affiliations:
  • University of Rochester, New York;Intel Barcelona Research Center, Barcelona, Spain;University of Rochester, New York;University of Rochester, New York;University of Rochester, New York

  • Venue:
  • Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
  • Year:
  • 2004

Abstract

Loop fusion combines corresponding iterations of different loops. It is traditionally used to decrease program run time, by reducing loop overhead and increasing data locality. In this paper, however, we consider its effect on energy. Fusion tends to increase the uniformity, or balance, of demand for system resources. On a conventional superscalar processor, increased balance tends to increase IPC, and thus dynamic power, so that fusion-induced improvements in program energy are slightly smaller than improvements in program run time. If IPC is held constant, however, by reducing frequency and voltage, particularly on a processor with multiple clock domains, then energy improvements may significantly exceed run time improvements. We demonstrate the benefits of increased program balance under a theoretical model of processor energy consumption. We then evaluate the benefits of fusion empirically on synthetic and real-world benchmarks, using our existing loop-fusing compiler and a heavily modified version of the SimpleScalar/Wattch simulator. For the real-world benchmarks, we demonstrate energy savings ranging from 7% to 40%, with run times ranging from a 1% slowdown to a 17% speedup. In addition to validating our theoretical model, the simulation results allow us to "tease apart" the factors that contribute to fusion-induced time and energy savings.
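As a rough illustration of the transformation the abstract describes, the sketch below shows two loops over the same index range and a fused version that combines their corresponding iterations. It is not taken from the paper or its compiler: the function names `unfused` and `fused` and the arithmetic in the loop bodies are hypothetical, chosen only to show that fusion preserves results while removing one loop's overhead and reusing `c[i]` and `a[i]` while they are still close at hand.

```python
def unfused(a, b):
    """Two separate loops over the same index range (hypothetical example)."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):      # loop 1: reads a and b, writes c
        c[i] = a[i] + b[i]
    for i in range(n):      # loop 2: re-reads c and a, writes d
        d[i] = c[i] * a[i]
    return c, d


def fused(a, b):
    """The same computation with corresponding iterations combined."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):      # one loop: c[i] is consumed right after it is produced
        c[i] = a[i] + b[i]
        d[i] = c[i] * a[i]
    return c, d


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [4.0, 5.0, 6.0]
    # Fusion is legal here because iteration i of loop 2 depends only on
    # iteration i of loop 1, so both versions compute identical results.
    print(unfused(a, b) == fused(a, b))
```

In the unfused version, the additions and the multiplications run in two distinct phases; in the fused version each iteration mixes both kinds of work, which is one informal way to picture the more uniform resource demand the abstract refers to.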