Cache design trade-offs for power and performance optimization: a case study
ISLPED '95 Proceedings of the 1995 international symposium on Low power design
Complexity-effective superscalar processors
Proceedings of the 24th annual international symposium on Computer architecture
The filter cache: an energy efficient memory structure
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Power considerations in the design of the Alpha 21264 microprocessor
DAC '98 Proceedings of the 35th annual Design Automation Conference
Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
Using dynamic cache management techniques to reduce energy in a high-performance processor
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Selective cache ways: on-demand cache resource allocation
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
ACM Transactions on Computer Systems (TOCS)
Power and energy reduction via pipeline balancing
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Reducing set-associative cache energy via way-prediction and selective direct-mapping
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Power optimization using dynamic power management
SBCCI'99 Proceedings of the XIIth conference on Integrated circuits and systems design
Low power synthesis of dynamic logic circuits using fine-grained clock gating
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Low-power design techniques for scaled technologies
Integration, the VLSI Journal - Special issue: Low-power design techniques
A Floorplan-Aware Dynamic Inductive Noise Controller for Reliable Processor Design
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Low-power design techniques for scaled technologies
Integration, the VLSI Journal - Special issue: Low-power design techniques
Low power input/output port design using clock gating technique
ICNVS'10 Proceedings of the 12th international conference on Networking, VLSI and signal processing
VAIL: variation-aware issue logic and performance binning for processor yield and profit improvement
Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design
Asynchronous computing in sense amplifier-based pass transistor logic
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Integrated microarchitectural floorplanning and run-time controller for inductive noise mitigation
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Fast power- and slew-aware gated clock tree synthesis
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
ELEON3LP - Superscalar and low-power enhancements of single issue general purpose processor model
Microprocessors & Microsystems
Hi-index | 0.00 |
With the scaling of technology and the need for higher performance and more functionality, power dissipation is becoming a major bottleneck for microprocessor designs. Because clock power can be significant in high-performance processors, we propose a deterministic clock-gating (DCG) technique which effectively reduces clock power. DCG is based on the key observation that for many of the pipelined stages of a modern processor, the circuit block usage in the near future is known a few cycles ahead of time. Our experiments show an average of 19.9% reduction in processor power with virtually no performance loss for an eight-issue, out-of-order superscalar by applying DCG to execution units, pipeline latches, D-cache wordline decoders, and result bus drivers.