DVFS in loop accelerators using BLADES

Authors:
Ganesh Dasika;Shidhartha Das;Kevin Fan;Scott Mahlke;David Bull
Affiliations:
University of Michigan - Ann Arbor, MI;ARM, Ltd., Cambridge, United Kingdom;University of Michigan - Ann Arbor, MI;University of Michigan - Ann Arbor, MI;ARM, Ltd., Cambridge, United Kingdom
Venue:
Proceedings of the 45th annual Design Automation Conference
Year:
2008

Abstract

Hardware accelerators are common in embedded systems that have high performance requirements but must still operate within stringent energy constraints. To facilitate short time-to-market and reduced non-recurring engineering costs, automatic systems that can rapidly generate hardware with both power and performance in mind are extremely attractive. This paper proposes the BLADES (Better-than-worst-case Loop Accelerator Design) system for automatically designing self-tuning hardware accelerators that dynamically select their best operating frequency and voltage based on environmental conditions, silicon variation, and input data characteristics. Errors in operation are detected by Razor flip-flops, which then trigger recovery. The architecture efficiently supports detection, rollback, and recovery to provide a highly adaptable and configurable loop accelerator. The overhead of deploying Razor flip-flops is significantly reduced by automatically chaining primitive computation operations together. Results on a range of loop accelerators show average energy savings of 32%, obtained by scaling the voltage below the nominal supply voltage.
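
The self-tuning behaviour described in the abstract can be illustrated with a small simulation. The sketch below is not the controller from the paper, whose details the abstract does not give. It assumes a simple step-up/step-down policy driven by the observed timing-error rate, and every name in it (tune_voltage, toy_error_model, the millivolt thresholds, the target rate) is hypothetical. Voltages are integers in millivolts so that the arithmetic stays exact.

```python
def tune_voltage(error_rate_at, v_nominal_mv=1000, v_min_mv=600,
                 step_mv=50, target_rate=0.01, epochs=20):
    """Hypothetical closed-loop policy (an assumption, not the paper's design):
    lower the supply while the observed error rate stays at or below
    target_rate, and raise it again when the rate exceeds the target."""
    v = v_nominal_mv
    trace = []
    for _ in range(epochs):
        rate = error_rate_at(v)   # fraction of operations flagged as timing errors
        trace.append((v, rate))
        if rate > target_rate:
            # Too many detect-and-recover events: step back toward nominal.
            v = min(v_nominal_mv, v + step_mv)
        elif v - step_mv >= v_min_mv:
            # Errors are rare enough: try a lower voltage next epoch.
            v -= step_mv
    return v, trace


def toy_error_model(v_mv, v_critical_mv=800):
    """Assumed error model for illustration only: no timing errors at or
    above v_critical_mv, then a linear rise as the voltage drops further."""
    if v_mv >= v_critical_mv:
        return 0.0
    return (v_critical_mv - v_mv) / 1000.0


final_v, trace = tune_voltage(toy_error_model)
print(final_v)
print(trace[:6])
```

Under this toy model the loop steps down from the nominal voltage, crosses the assumed point of first failure, and then alternates between the last error-free setting and the first setting that exceeds the target rate. A real design would derive the error rate from hardware error-detection signals rather than from a function of voltage, and would also have to account for the cost of each rollback.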