The issue logic of dynamically scheduled superscalar processors is one of their most complex and power-consuming parts. In this paper we present alternative issue-logic designs that are much simpler than the traditional scheme while retaining most of its ability to exploit instruction-level parallelism (ILP). These alternative schemes are based on two observations: most values produced by a program are used by very few instructions, and the latencies of most operations are deterministic.
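The first observation can be illustrated with a minimal sketch that counts how many instructions read each produced value. This is not the paper's methodology: the trace format, the `TRACE` contents, and the function names are hypothetical, chosen only to show what "used by very few instructions" means in practice.

```python
from collections import Counter

# Hypothetical toy trace: each entry is (destination register, source registers).
TRACE = [
    ("r1", ()),            # r1 = load
    ("r2", ("r1",)),       # r2 = r1 + 1
    ("r3", ("r1", "r2")),  # r3 = r1 * r2
    ("r1", ("r3",)),       # r1 is redefined, so a new value starts here
    ("r4", ("r1",)),       # r4 = r1 - 1
]

def consumers_per_value(trace):
    """Return counts[i] = number of reads of the value produced by instruction i."""
    current = {}  # register -> index of the instruction that produced its live value
    counts = []
    for i, (dest, srcs) in enumerate(trace):
        # Sources are read before the destination is overwritten.
        for src in srcs:
            if src in current:
                counts[current[src]] += 1
        counts.append(0)
        current[dest] = i
    return counts

def histogram(trace):
    """Map 'number of consumers' -> 'how many produced values have that many'."""
    return Counter(consumers_per_value(trace))

if __name__ == "__main__":
    print(consumers_per_value(TRACE))
    print(dict(histogram(TRACE)))
```

On a real instruction trace, a histogram concentrated at one or two consumers per value would be consistent with the observation the designs rely on.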