A portable global optimizer and linker
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor
Digital Technical Journal
Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Filtering Memory References to Increase Energy Efficiency
IEEE Transactions on Computers
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
ACM Computing Surveys (CSUR)
Using dynamic cache management techniques to reduce energy in general purpose processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special issue on system-level interconnect prediction
Instruction flow-based front-end throttling for power-aware high-performance processors
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
DSP Processors Hit the Mainstream
Computer
High Performance and Energy Efficient Serial Prefetch Architecture
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Power-Aware Control Speculation through Selective Throttling
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Applying Decay Strategies to Branch Predictors for Leakage Energy Savings
ICCD '02 Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD'02)
Cyclone: a broadcast-free dynamic instruction scheduler with selective replay
Proceedings of the 30th annual international symposium on Computer architecture
Power Issues Related to Branch Prediction
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
IEEE Transactions on Computers
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Power efficient branch prediction through early identification of branch addresses
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Guaranteeing Hits to Improve the Efficiency of a Small Instruction Cache
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Designing a practical data filter cache to improve both energy efficiency and performance
ACM Transactions on Architecture and Code Optimization (TACO)
Reducing instruction fetch energy in multi-issue processors
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Instruction fetch behavior has been shown to be very regular and predictable, even for diverse application areas. In this work, we propose the Lookahead Instruction Fetch Engine (LIFE), which is designed to exploit the regularity present in instruction fetch. The nucleus of LIFE is the Tagless Hit Instruction Cache (TH-IC), a small cache that assists the instruction fetch pipeline stage as it efficiently captures information about both sequential and non-sequential transitions between instructions. TH-IC provides a considerable savings in fetch energy without incurring the performance penalty normally associated with small filter instruction caches. LIFE extends TH-IC by making use of advanced control flow metadata to further improve utilization of fetch-associated structures such as the branch predictor, branch target buffer, and return address stack. These structures are selectively disabled by LIFE when it can be determined that they are unnecessary for the following instruction to be fetched. Our results show that LIFE enables further reductions in total processor energy consumption with no impact on application execution times even for the most aggressive power-saving configuration. We also explore the use of LIFE metadata on guiding decisions further down the pipeline. Next sequential line prefetch for the data cache can be enhanced by only prefetching when the triggering instruction has been previously accessed in the TH-IC. This strategy reduces the number of useless prefetches and thus contributes to improving overall processor efficiency. LIFE enables designers to boost instruction fetch efficiency by reducing energy cost without negatively affecting performance.