This paper develops a technique that combines the advantages of compile-time static scheduling and hardware dynamic scheduling to reduce energy consumption in dynamically scheduled processors. In this hybrid-scheduling paradigm, regions of the application that contain large amounts of parallelism visible at compile time bypass the dynamic scheduling hardware and execute in a low-power static mode. Experiments on several media and scientific benchmarks demonstrate that the proposed scheme can significantly reduce energy consumption with negligible performance degradation.
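The mode selection described above can be illustrated with a toy sketch. It is not the paper's algorithm: the `Region` type, the ILP estimate (instruction count divided by critical-path length, assuming unit latencies), and the `STATIC_ILP_THRESHOLD` value are all hypothetical choices made only to show the idea of routing highly parallel regions to a static mode.

```python
from dataclasses import dataclass

# Hypothetical cutoff (an assumption, not from the paper): regions whose
# compile-time ILP estimate reaches this value run in static mode.
STATIC_ILP_THRESHOLD = 3.0


@dataclass
class Region:
    name: str
    instructions: int   # number of instructions in the region
    critical_path: int  # longest dependence chain, unit latency assumed

    @property
    def static_ilp(self) -> float:
        # Crude compile-time parallelism estimate: work / critical path.
        return self.instructions / self.critical_path


def choose_mode(region: Region, threshold: float = STATIC_ILP_THRESHOLD) -> str:
    # "static": bypass the dynamic scheduling hardware (low-power mode).
    # "dynamic": leave the region to the hardware scheduler.
    return "static" if region.static_ilp >= threshold else "dynamic"


# Illustrative regions with made-up numbers.
regions = [
    Region("fir_loop", instructions=48, critical_path=8),
    Region("parse_header", instructions=20, critical_path=16),
]
modes = {r.name: choose_mode(r) for r in regions}
print(modes)
```

In this sketch the loop-like region has an estimated ILP of 6.0 and is marked for static mode, while the dependence-bound region (1.25) stays with the dynamic scheduler. A real compiler would base the decision on its actual schedule and on the energy model of the target hardware.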