Enabling large decoded instruction loop caching for energy-aware embedded processors

  • Authors:
  • Ji Gu; Hui Guo

  • Affiliations:
  • The University of New South Wales, Sydney, Australia; The University of New South Wales, Sydney, Australia

  • Venue:
  • CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
  • Year:
  • 2010

Abstract

Low energy consumption in embedded processors becomes increasingly important as system complexity grows. The on-chip instruction cache (I-cache) is usually one of the most energy-consuming components on the processor chip because of its large size and frequent accesses. To reduce this energy consumption, existing loop cache approaches use a tiny decoded cache to filter I-cache accesses and instruction decode activity during repeated loop iterations. However, such designs are effective only for small, simple loops and are suitable only for DSP kernel-like applications. They are not effective for the many embedded applications in which complex loops are common. In this paper, we propose a decoded loop instruction cache (DLIC) that is small, and hence energy efficient, yet can capture most loops, including large and nested loops that contain branches, so that a significant share of I-cache accesses and instruction decoding can be eliminated. Experiments on a set of embedded benchmarks show that the proposed DLIC scheme reduces energy consumption by up to 87%. On average, 66% of the energy spent on instruction fetching and decoding is saved, at a performance overhead of only 1.4%.
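
The filtering effect that the abstract attributes to existing loop caches can be illustrated with a toy trace simulation. This is a minimal sketch and not the DLIC design proposed in the paper: the function name `simulate`, the word-indexed addresses, the rule that a backward jump marks a loop, and the `capacity` parameter are all assumptions made for illustration.

```python
def simulate(trace, capacity):
    """Count I-cache accesses and loop-cache hits for a list of instruction addresses.

    Hypothetical model: addresses are word indices, and a backward jump
    (next address <= previous address) marks a loop from target to branch.
    """
    loop = None          # (start, end) of the loop currently tracked, if any
    cached = set()       # addresses whose decoded form sits in the loop cache
    icache_accesses = 0
    hits = 0
    prev = None
    for pc in trace:
        # A backward jump reveals a loop; track it only if its body fits.
        if prev is not None and pc <= prev:
            size = prev - pc + 1
            if size <= capacity and loop != (pc, prev):
                loop = (pc, prev)
                cached = set()
        if loop is not None and loop[0] <= pc <= loop[1]:
            if pc in cached:
                hits += 1                # served decoded, no I-cache access
            else:
                icache_accesses += 1     # fill pass: fetch, decode, store
                cached.add(pc)
        else:
            icache_accesses += 1         # outside any tracked loop
            loop = None
            cached = set()
        prev = pc
    return icache_accesses, hits


# Assumed example trace: a 4-instruction loop at addresses 10..13, run 5 times.
trace = [8, 9] + [10, 11, 12, 13] * 5 + [14]
print(simulate(trace, capacity=8))
print(simulate(trace, capacity=2))
```

In this toy model, the first iteration exposes the backward branch, the second fills the loop cache, and later iterations are served from it. With `capacity=2` the loop body does not fit and every fetch goes to the I-cache, which mirrors the limitation on loop size that the abstract describes for simple loop caches.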