Evaluating Integrated Hardware-Software Optimizations Using a Unified Energy Estimation Framework

  • Authors:
  • N. Vijaykrishnan; Mahmut Kandemir; Mary Jane Irwin; Hyun Suk Kim; Wu Ye; David Duarte

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 2003

Abstract

With the emergence of a plethora of embedded and portable applications, energy dissipation has joined throughput, VLSI layout area, and accuracy/precision as a major design constraint. Thus, designers must be concerned with both estimating and optimizing the energy consumption of circuits, architectures, and software. Most of the research in energy optimization and/or estimation has focused on single components of the system and has not looked across the interacting spectrum of the hardware and software. The novelty of our energy estimation framework, SimplePower, is that it evaluates energy for the system as a whole rather than as a sum of its parts, and that it concurrently supports both compiler and architectural experimentation. We present the design and use of the SimplePower framework, which includes a transition-sensitive, cycle-accurate datapath energy model that interfaces with analytical and transition-sensitive energy models for the memory, clock and bus subsystems, respectively. Such an architectural-level energy estimation framework is invaluable in making good energy-conscious decisions early in the design cycle. We analyzed the energy consumption of 10 codes from the multidimensional array domain, a domain that is important for embedded video and signal processing systems. Our study shows that the pipeline registers and the register file are the datapath energy hotspots, consuming 58-70 percent of overall datapath energy, and that the clocking of the on-chip memory structures is the major source of the on-chip clock network's energy consumption. Further, we found that the off-chip main memory is the overall energy bottleneck of the entire system. However, we also found that applying high-level compiler optimizations reduces the main memory energy significantly, causing the contributions of the data cache, on-chip clock network, instruction cache, and datapath to become more important.
We found that the improved locality of the optimized codes is useful not only in reducing the accesses to the main memory but also in exploiting the more energy-efficient cache architectures much better than the unoptimized codes do. With the most recently used way-prediction cache scheme, optimized codes from the multidimensional array domain saved 21 percent more energy than unoptimized codes. We also observed that emerging technologies such as embedded DRAM, coupled with a combination of energy-efficient circuit, architectural, and compiler optimizations, can potentially shift the energy hotspot. Thus, we have demonstrated that early estimates from the powerful SimplePower energy estimation framework can help one to identify the system energy hotspots and enable architects and compiler designers to focus their efforts on these areas.
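
As a rough illustration of what "transition-sensitive, cycle-accurate" estimation means, the sketch below charges energy each cycle according to how many bits toggle between consecutive values on a bus or register. It is not SimplePower's actual model, which the abstract does not specify. The function names, the 0.5·C·V² per-toggle charge, and the default capacitance and supply voltage are all assumptions chosen for illustration.

```python
def toggled_bits(prev: int, curr: int, width: int = 32) -> int:
    """Count the bits that differ between two consecutive values."""
    mask = (1 << width) - 1
    return bin((prev ^ curr) & mask).count("1")


def transition_energy(values, cap_per_bit=1e-12, vdd=3.3, width=32):
    """Accumulate energy cycle by cycle (hypothetical model).

    Assumes each toggled bit costs 0.5 * C * Vdd^2 and that the
    unit holds 0 before the first cycle. Both are simplifications.
    """
    energy = 0.0
    prev = 0
    for curr in values:  # one value per simulated cycle
        energy += 0.5 * cap_per_bit * vdd ** 2 * toggled_bits(prev, curr, width)
        prev = curr
    return energy
```

Under this simplified model, a stream that repeats the same value costs energy only on its first transition, which is why the estimate depends on the data actually flowing through the unit rather than on the access count alone.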