Principles of CMOS VLSI design: a systems perspective
Principles of CMOS VLSI design: a systems perspective
Analytical energy dissipation models for low-power caches
ISLPED '97 Proceedings of the 1997 international symposium on Low power electronics and design
The filter cache: an energy efficient memory structure
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
Dynamic IPC/clock rate optimization
Proceedings of the 25th annual international symposium on Computer architecture
Power and performance tradeoffs using various caching strategies
ISLPED '98 Proceedings of the 1998 international symposium on Low power electronics and design
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Energy-driven integrated hardware-software optimizations using SimplePower
Proceedings of the 27th annual international symposium on Computer architecture
Profile-driven code execution for low power dissipation (poster session)
ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Very low power pipelines using significance compression
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Thermal Management System for High Performance PowerPCTM Microprocessors
COMPCON '97 Proceedings of the 42nd IEEE International Computer Conference
Cache designs for energy efficiency
HICSS '95 Proceedings of the 28th Hawaii International Conference on System Sciences
Complexity-effective superscalar processors
Complexity-effective superscalar processors
Inherently lower-power high-performance superscalar architectures
Inherently lower-power high-performance superscalar architectures
Low-complexity reorder buffer architecture
ICS '02 Proceedings of the 16th international conference on Supercomputing
Efficient dynamic scheduling through tag elimination
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
A large, fast instruction window for tolerating cache misses
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
A scalable instruction queue design using dependence chains
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Tradeoffs in power-efficient issue queue design
Proceedings of the 2002 international symposium on Low power electronics and design
Energy-efficient hybrid wakeup logic
Proceedings of the 2002 international symposium on Low power electronics and design
On achieving balanced power consumption in software pipelined loops
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Microarchitecture-level power management
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Integrating Adaptive On-Chip Storage Structures for Reduced Dynamic Power
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Energy-Efficient Design of the Reorder Buffer
PATMOS '02 Proceedings of the 12th International Workshop on Integrated Circuit Design. Power and Timing Modeling, Optimization and Simulation
Generating physical addresses directly for saving instruction TLB energy
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Power-aware issue queue design for speculative instructions
Proceedings of the 40th annual Design Automation Conference
Power-efficient issue queue design
Power aware computing
Micro-architecture design and control speculation for energy reduction
Power aware computing
Power-Aware Control Speculation through Selective Throttling
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Deterministic Clock Gating for Microprocessor Power Reduction
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Front-End Policies for Improved Issue Efficiency in SMT Processors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Dynamic Data Dependence Tracking and its Application to Branch Prediction
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Energy efficient co-adaptive instruction fetch and issue
Proceedings of the 30th annual international symposium on Computer architecture
Positional adaptation of processors: application to energy reduction
Proceedings of the 30th annual international symposium on Computer architecture
Reducing reorder buffer complexity through selective operand caching
Proceedings of the 2003 international symposium on Low power electronics and design
Microprocessor pipeline energy analysis
Proceedings of the 2003 international symposium on Low power electronics and design
A mixed-clock issue queue design for globally asynchronous, locally synchronous processor cores
Proceedings of the 2003 international symposium on Low power electronics and design
Branch prediction on demand: an energy-efficient solution
Proceedings of the 2003 international symposium on Low power electronics and design
Exploiting compiler-generated schedules for energy savings in high-performance processors
Proceedings of the 2003 international symposium on Low power electronics and design
Comparing Program Phase Detection Techniques
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Reducing Design Complexity of the Load/Store Queue
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Energy-efficient issue queue design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
State-Preserving vs. Non-State-Preserving Leakage Control in Caches
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Combining compiler and runtime IPC predictions to reduce energy in next generation architectures
Proceedings of the 1st conference on Computing frontiers
Complexity-Effective Reorder Buffer Designs for Superscalar Processors
IEEE Transactions on Computers
Isolating Short-Lived Operands for Energy Reduction
IEEE Transactions on Computers
Energy Efficient Comparators for Superscalar Datapaths
IEEE Transactions on Computers
A low-power in-order/out-of-order issue queue
ACM Transactions on Architecture and Code Optimization (TACO)
Exploiting program execution phases to trade power and performance for media workload
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
Dynamically Trading Frequency for Complexity in a GALS Microprocessor
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Effective Adaptive Computing Environment Management via Dynamic Optimization
Proceedings of the international symposium on Code generation and optimization
Optimizing instruction TLB energy using software and hardware techniques
ACM Transactions on Design Automation of Electronic Systems (TODAES)
An efficient wakeup design for energy reduction in high-performance superscalar processors
Proceedings of the 2nd conference on Computing frontiers
Instruction packing: reducing power and delay of the dynamic scheduling logic
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Performance directed energy management for main memory and disks
ACM Transactions on Storage (TOS)
An asymmetric clustered processor based on value content
Proceedings of the 19th annual international conference on Supercomputing
Low-power, low-complexity instruction issue using compiler assistance
Proceedings of the 19th annual international conference on Supercomputing
Reducing the Energy of Speculative Instruction Schedulers
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
A New Pointer-based Instruction Queue Design and Its Power-Performance Evaluation
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Power-Efficient Wakeup Tag Broadcast
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Balancing Resource Utilization to Mitigate Power Density in Processor Pipelines
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Dynamic Resizing of Superscalar Datapath Components for Energy Efficiency
IEEE Transactions on Computers
Power reduction techniques for microprocessor systems
ACM Computing Surveys (CSUR)
Impact of virtual execution environments on processor energy consumption and hardware adaptation
Proceedings of the 2nd international conference on Virtual execution environments
Spin Detection Hardware for Improved Management of Multithreaded Systems
IEEE Transactions on Parallel and Distributed Systems
Instruction packing: Toward fast and energy-efficient instruction scheduling
ACM Transactions on Architecture and Code Optimization (TACO)
Power-efficient instruction delivery through trace reuse
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
SEED: scalable, efficient enforcement of dependences
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Energy-efficient dynamic instruction scheduling logic through instruction grouping
Proceedings of the 2006 international symposium on Low power electronics and design
Effective management of multiple configurable units using dynamic optimization
ACM Transactions on Architecture and Code Optimization (TACO)
Exploiting Operand Availability for Efficient Simultaneous Multithreading
IEEE Transactions on Computers
Hybrid-scheduling for reduced energy consumption in high-performance processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Using fine grain multithreading for energy efficient computing
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Unified microprocessor core storage
Proceedings of the 4th international conference on Computing frontiers
By-passing the out-of-order execution pipeline to increase energy-efficiency
Proceedings of the 4th international conference on Computing frontiers
Addressing instruction fetch bottlenecks by using an instruction register file
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Cross-component energy management: Joint adaptation of processor and memory
ACM Transactions on Architecture and Code Optimization (TACO)
Architectural contesting: exposing and exploiting temperamental behavior
ACM SIGARCH Computer Architecture News - Special issue on the 2006 reconfigurable and adaptive architecture workshop
Scalable Dynamic Instruction Scheduler through Wake-Up Spatial Locality
IEEE Transactions on Computers
Building a large instruction window through ROB compression
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
Efficiency trends and limits from comprehensive microarchitectural adaptivity
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
A partitioned instruction queue to reduce instruction wakeup energy
International Journal of High Performance Computing and Networking
International Journal of High Performance Computing and Networking
Journal of Embedded Computing - Embeded Processors and Systems: Architectural Issues and Solutions for Emerging Applications
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
Journal of Systems Architecture: the EUROMICRO Journal
Investigating the effects of fine-grain three-dimensional integration on microarchitecture design
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Streamlining long latency instructions for seamlessly combined out-of-order and in-order execution
Microprocessors & Microsystems
HeDGE: Hybrid Dataflow Graph Execution in the Issue Logic
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Fetch Gating Control through Speculative Instruction Window Weighting
Transactions on High-Performance Embedded Architectures and Compilers II
An energy-efficient instruction scheduler design with two-level shelving and adaptive banking
Journal of Computer Science and Technology
Energy-efficient register caching with compiler assistance
ACM Transactions on Architecture and Code Optimization (TACO)
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
Fetch gating control through speculative instruction window weighting
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Early-stage definition of LPX: a low power issue-execute processor
PACS'02 Proceedings of the 2nd international conference on Power-aware computer systems
PACS'02 Proceedings of the 2nd international conference on Power-aware computer systems
A model to exploit power-performance efficiency in superscalar processors via structure resizing
Proceedings of the 20th symposium on Great lakes symposium on VLSI
LPA: a first approach to the loop processor architecture
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Power-aware BTB for modern processors
Computers and Electrical Engineering
A Predictive Model for Dynamic Microarchitectural Adaptivity Control
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Wake-up logic optimizations through selective match and wakeup range limitation
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Energy-efficient dynamic instruction scheduling logic through instruction grouping
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
DCG: deterministic clock-gating for low-power microprocessor design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2002 international symposium on low-power electronics and design (ISLPED)
CROB: implementing a large instruction window through compression
Transactions on high-performance embedded architectures and compilers III
Implementing cryptography on TFT technology for secure display applications
CARDIS'06 Proceedings of the 7th IFIP WG 8.8/11.2 international conference on Smart Card Research and Advanced Applications
Design space navigation for neighboring power-performance efficient microprocessor configurations
ARCS'05 Proceedings of the 18th international conference on Architecture of Computing Systems conference on Systems Aspects in Organic and Pervasive Computing
Exploring the potential of architecture-level power optimizations
PACS'03 Proceedings of the Third international conference on Power - Aware Computer Systems
PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
A vector approach to cryptography implementation
DRMTICS'05 Proceedings of the First international conference on Digital Rights Management: technologies, Issues, Challenges and Systems
Compiler directed issue queue energy reduction
Transactions on High-Performance Embedded Architectures and Compilers IV
MLP-Aware instruction queue resizing: the key to power-efficient performance
ARCS'10 Proceedings of the 23rd international conference on Architecture of Computing Systems
Flicker: a dynamically adaptive architecture for power limited multicore systems
Proceedings of the 40th Annual International Symposium on Computer Architecture
Enhancing NBTI recovery in SRAM arrays through recovery boosting
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Revisiting reorder buffer architecture for next generation high performance computing
The Journal of Supercomputing
MLP-aware dynamic instruction window resizing for adaptively exploiting both ILP and MLP
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Dynamic microarchitectural adaptation using machine learning
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.02 |
The issue logic of a dynamically-scheduled superscalar processor is a complex mechanism devoted to start the execution of multiple instructions every cycle. Due to its complexity, it is responsible for a significant percentage of the energy consumed by a microprocessor. The energy consumption of the issue logic depends on several architectural parameters, the instruction issue queue size being one of the most important. In this paper we present a technique to reduce the energy consumption of the issue logic of a high-performance superscalar processor. The proposed technique is based on the observation that the conventional issue logic wastes a significant amount of energy for useless activity. In particular, the wake-up of empty entries and operands that are ready represents an important source of energy waste. Besides, we propose a mechanism to dynamically reduce the effective size of the instruction queue. We show that on average the effective instruction queue size can be reduced by a factor of 26% with minimal impact on performance. This reduction together with the energy saved for empty and ready entries result in about 90.7% reduction in the energy consumed by the wake-up logic, which represents 14.9% of the total energy of the assumed processor.