MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
Very long instruction work architectures and the ELI-512
25 years of the international symposia on Computer architecture (selected papers)
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Lx: a technology platform for customizable VLIW embedded processing
Proceedings of the 27th annual international symposium on Computer architecture
Power-aware modulo scheduling for high-performance VLIW processors
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Low swing dual threshold voltage domino logic
Proceedings of the 12th ACM Great Lakes symposium on VLSI
Drowsy caches: simple techniques for reducing leakage power
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Exploiting VLIW schedule slacks for dynamic and leakage energy reduction
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Reducing the complexity of the register file in dynamic superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Cluster assignment for high-performance embedded VLIW processors
ACM Transactions on Design Automation of Electronic Systems (TODAES)
The TigerSHARC DSP Architecture
IEEE Micro
Optimizing Static Power Dissipation by Functional Units in Superscalar Processors
CC '02 Proceedings of the 11th International Conference on Compiler Construction
Managing static leakage energy in microprocessor functional units
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Adapting instruction level parallelism for optimizing leakage in VLIW architectures
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Region-based hierarchical operation partitioning for multicluster processors
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Reducing register ports using delayed write-back queues and operand pre-fetch
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Efficient Backtracking Instruction Schedulers
PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
Energy-Efficient Register Access
SBCCI '00 Proceedings of the 13th symposium on Integrated circuits and systems design
Temperature-aware microarchitecture
Proceedings of the 30th annual international symposium on Computer architecture
Profile-Based Dynamic Voltage Scheduling Using Program Checkpoints
Proceedings of the conference on Design, automation and test in Europe
CARS: A New Code Generation Framework for Clustered ILP Processors
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Integrated temporal and spatial scheduling for extended operand clustered VLIW processors
Proceedings of the 1st conference on Computing frontiers
Power-aware compilation for register file energy reduction
International Journal of Parallel Programming - Special issue: Workshop on application specific processors (WASP)
A Graph Matching Based Integrated Scheduling Framework for Clustered VLIW Processors
ICPPW '04 Proceedings of the 2004 International Conference on Parallel Processing Workshops
Power estimation of low-power high-performance memory structures
Power estimation of low-power high-performance memory structures
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Compiler-directed thermal management for VLIW functional units
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
Bypass aware instruction scheduling for register file power reduction
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
Decomposition of instruction decoders for low-power designs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Power Reduction of Functional Units Considering Temperature and Process Variations
VLSID '08 Proceedings of the 21st International Conference on VLSI Design
Hi-index | 0.00 |
Clustered VLIW architectures solve the scalability problem associated with flat VLIW architectures by partitioning the register file and connecting only a subset of the functional units to a register file. However, inter-cluster communication in clustered architectures leads to increased leakage in functional components and a high number of register accesses. In this paper, we propose compiler scheduling algorithms targeting two previously ignored power-hungry components in clustered VLIW architectures, viz., instruction decoder and register file. We consider a split decoder design and propose a new energy-aware instruction scheduling algorithm that provides 14.5% and 17.3% benefit in the decoder power consumption on an average over a purely hardware based scheme in the context of 2-clustered and 4-clustered VLIW machines. In the case of register files, we propose two new scheduling algorithms that exploit limited register snooping capability to reduce extra register file accesses. The proposed algorithms reduce register file power consumption on an average by 6.85% and 11.90% (10.39% and 17.78%), respectively, along with performance improvement of 4.81% and 5.34% (9.39% and 11.16%) over a traditional greedy algorithm for 2-clustered (4-clustered) VLIW machine.