Superoptimizer: a look at the smallest program
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
Principles of database and knowledge-base systems, Vol. I
Principles of database and knowledge-base systems, Vol. I
Code generation by a generalized neural network: general principles and elementary examples
Journal of Parallel and Distributed Computing - Neural Computing
An approach to ordering optimizing transformations
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
Constant propagation with conditional branches
ACM Transactions on Programming Languages and Systems (TOPLAS)
Static Rate-Optimal Scheduling of Iterative Data-Flow Programs Via Optimum Unfolding
IEEE Transactions on Computers
High level synthesis of ASICs under timing and synchronization constraints
High level synthesis of ASICs under timing and synchronization constraints
Critical path minimization using retiming and algebraic speed-up
DAC '93 Proceedings of the 30th international Design Automation Conference
Power analysis of embedded software: a first step towards software power minimization
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special issue on low-power design
Compiler transformations for high-performance computing
ACM Computing Surveys (CSUR)
High-level algorithm and architecture transformations for DSP synthesis
Journal of VLSI Signal Processing Systems - Special issue on design environments for DSP
Proceedings of the 1995 Asia and South Pacific Design Automation Conference
Asia Pacific Design Automation Conference
Design considerations and tools for low-voltage digital system design
DAC '96 Proceedings of the 33rd annual Design Automation Conference
DAC '96 Proceedings of the 33rd annual Design Automation Conference
Exploiting regularity for low-power design
Proceedings of the 1996 IEEE/ACM international conference on Computer-aided design
Power analysis and minimization techniques for embedded DSP software
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Global node reduction of linear systems using ratio analysis
ISSS '94 Proceedings of the 7th international symposium on High-level synthesis
Maximally fast and arbitrarily fast implementation of linear computations
ICCAD '92 Proceedings of the 1992 IEEE/ACM international conference on Computer-aided design
Performance optimization of sequential circuits by eliminating retiming bottlenecks
ICCAD '92 Proceedings of the 1992 IEEE/ACM international conference on Computer-aided design
Using Peephole Optimization on Intermediate Code
ACM Transactions on Programming Languages and Systems (TOPLAS)
Wireless Personal Communications: The Future of Talk
Wireless Personal Communications: The Future of Talk
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
Fast Prototyping of Datapath-Intensive Architectures
IEEE Design & Test
A Loop Transformation Theory and an Algorithm to Maximize Parallelism
IEEE Transactions on Parallel and Distributed Systems
Behavioral Synthesis for low Power
ICCS '94 Proceedings of the1994 IEEE International Conference on Computer Design: VLSI in Computer & Processors
Microarchitectural Synthesis of Performance-Constrained, Low-Power VLSI Designs
ICCS '94 Proceedings of the1994 IEEE International Conference on Computer Design: VLSI in Computer & Processors
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Activity-sensitive architectural power analysis
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Optimizing power using transformations
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Hi-index | 0.00 |
We introduce an approach for power optimization using a set of compilation and architectural techniques. The key technical innovation is a novel divide-and-conquer compilation technique to minimize the number of operations for general computations. Our technique optimizes not only a significantly wider set of computations than the previously published techniques, but also outperforms (or performs at least as well as other techniques) on all examples. Along the architectural dimension, we investigate coordinated impact of compilation techniques on the number of processors which provide optimal trade-off between cost and power. We demonstrate that proper compilation techniques can significantly reduce power with bounded hardware cost. The effectiveness of all techniques and algorithms is documented on numerous real-life designs.