This paper introduces a novel optimization paradigm for increasing the throughput of digital systems. The basic idea is to transform fixed-latency units into variable-latency ones that run with a faster clock cycle. The transformation is fully automatic and can be used in conjunction with traditional design techniques to improve the overall performance of speed-critical units. In addition, we introduce procedures for reducing the area overhead of the modified units, and we formulate an algorithm for automatically restructuring the controllers of the data paths in which variable-latency units have been introduced. Results obtained on a large set of benchmark circuits show an average throughput improvement exceeding 27%, at the price of a modest area increase (less than 8% on average).
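To make the trade-off concrete, here is a minimal back-of-the-envelope sketch of why a variable-latency unit with a shorter clock can raise throughput. It assumes a simple model that the abstract does not spell out: most inputs finish in one fast cycle, and a fraction `p_slow` of inputs needs `slow_cycles` cycles. The function names, parameters, and numbers are illustrative assumptions, not taken from the paper.

```python
def average_cycle_count(p_slow: float, slow_cycles: int = 2) -> float:
    """Expected cycles per operation.

    Assumed model: a fraction p_slow of inputs takes slow_cycles cycles,
    all other inputs complete in a single cycle.
    """
    if not 0.0 <= p_slow <= 1.0:
        raise ValueError("p_slow must be a probability in [0, 1]")
    return (1.0 - p_slow) * 1 + p_slow * slow_cycles


def throughput_gain(t_fixed: float, t_fast: float,
                    p_slow: float, slow_cycles: int = 2) -> float:
    """Relative throughput gain of the variable-latency unit.

    t_fixed: clock period of the original fixed-latency unit.
    t_fast:  shorter clock period of the variable-latency unit.
    Returns a fraction (0.25 means 25% more operations per unit time).
    """
    avg_time = t_fast * average_cycle_count(p_slow, slow_cycles)
    return t_fixed / avg_time - 1.0


if __name__ == "__main__":
    # Hypothetical numbers: 10 ns original period, 7 ns fast period,
    # 10% of inputs needing a second cycle.
    gain = throughput_gain(t_fixed=10.0, t_fast=7.0, p_slow=0.1)
    print(f"throughput gain: {gain:.1%}")
```

Under this model the gain is positive only while the fast period times the average cycle count stays below the original period, so the benefit depends on how rarely the extra cycle is needed.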