Profile-based dynamic voltage and frequency scaling for a multiple clock domain microprocessor

  • Authors:
  • Grigorios Magklis;Michael L. Scott;Greg Semeraro;David H. Albonesi;Steven Dropsho

  • Affiliations:
  • University of Rochester, Rochester, NY;University of Rochester, Rochester, NY;University of Rochester, Rochester, NY;University of Rochester, Rochester, NY;University of Rochester, Rochester, NY

  • Venue:
  • Proceedings of the 30th Annual International Symposium on Computer Architecture
  • Year:
  • 2003

Abstract

A Multiple Clock Domain (MCD) processor addresses the challenges of clock distribution and power dissipation by dividing a chip into several (coarse-grained) clock domains, allowing frequency and voltage to be reduced in domains that are not currently on the application's critical path. Given a reconfiguration mechanism capable of choosing appropriate times and values for voltage/frequency scaling, an MCD processor has the potential to achieve significant energy savings with low performance degradation.

Early work on MCD processors evaluated the potential for energy savings by manually inserting reconfiguration instructions into applications, or by employing an oracle driven by off-line analysis of (identical) prior program runs. Subsequent work developed a hardware-based on-line mechanism that averages 75–85% of the energy-delay improvement achieved via off-line analysis.

In this paper we consider the automatic insertion of reconfiguration instructions into applications, using profile-driven binary rewriting. Profile-based reconfiguration introduces the need for "training runs" prior to production use of a given application, but avoids the hardware complexity of on-line reconfiguration. It also has the potential to yield significantly greater energy savings. Experimental results (training on small data sets and then running on larger, alternative data sets) indicate that the profile-driven approach is more stable than hardware-based reconfiguration, and yields virtually all of the energy-delay improvement achieved via off-line analysis.