The multiscalar architecture
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Increasing superscalar performance through multistreaming
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Complexity-effective superscalar processors
Proceedings of the 24th annual international symposium on Computer architecture
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The multicluster architecture: reducing cycle time through partitioning
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Speculative multithreaded processors
ICS '98 Proceedings of the 12th international conference on Supercomputing
Multiple-banked register file architectures
Proceedings of the 27th annual international symposium on Computer architecture
Very Long Instruction Word architectures and the ELI-512
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Superscalar Execution with Direct Data Forwarding
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
A circuit level implementation of an adaptive issue queue for power-aware microprocessors
GLSVLSI '01 Proceedings of the 11th Great Lakes symposium on VLSI
Reducing the complexity of the issue logic
ICS '01 Proceedings of the 15th international conference on Supercomputing
Energy reduction in queues and stacks by adaptive bitwidth compression
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Dual path instruction processing
ICS '02 Proceedings of the 16th international conference on Supercomputing
A large, fast instruction window for tolerating cache misses
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
A scalable instruction queue design using dependence chains
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Select-free instruction scheduling logic
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
A high-speed dynamic instruction scheduling scheme for superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Energy-efficient hybrid wakeup logic
Proceedings of the 2002 international symposium on Low power electronics and design
An Adaptive Issue Queue for Reduced Power at High Performance
PACS '00 Proceedings of the First International Workshop on Power-Aware Computer Systems-Revised Papers
Improving quasi-dynamic schedules through region slip
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Power-efficient issue queue design
Power aware computing
Front-End Policies for Improved Issue Efficiency in SMT Processors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
A mixed-clock issue queue design for globally asynchronous, locally synchronous processor cores
Proceedings of the 2003 international symposium on Low power electronics and design
Macro-op Scheduling: Relaxing Scheduling Loop Constraints
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Energy-efficient issue queue design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
Using Dynamic Binary Translation to Fuse Dependent Instructions
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Scaling the issue window with look-ahead latency prediction
Proceedings of the 18th annual international conference on Supercomputing
A time-predictable execution mode for superscalar pipelines with instruction prescheduling
Proceedings of the 2nd conference on Computing frontiers
Instruction packing: reducing power and delay of the dynamic scheduling logic
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Low-power, low-complexity instruction issue using compiler assistance
Proceedings of the 19th annual international conference on Supercomputing
A New Pointer-based Instruction Queue Design and Its Power-Performance Evaluation
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Power-Efficient Wakeup Tag Broadcast
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Instruction packing: Toward fast and energy-efficient instruction scheduling
ACM Transactions on Architecture and Code Optimization (TACO)
SEED: scalable, efficient enforcement of dependences
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Scientific applications vs. SPEC-FP: a comparison of program behavior
Proceedings of the 20th annual international conference on Supercomputing
Exploiting Operand Availability for Efficient Simultaneous Multithreading
IEEE Transactions on Computers
By-passing the out-of-order execution pipeline to increase energy-efficiency
Proceedings of the 4th international conference on Computing frontiers
Scalable Dynamic Instruction Scheduler through Wake-Up Spatial Locality
IEEE Transactions on Computers
A low-complexity microprocessor design with speculative pre-execution
Journal of Systems Architecture: the EUROMICRO Journal
Non-uniform instruction scheduling
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Instruction recirculation: eliminating counting logic in wakeup-free schedulers
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
Hi-index | 0.01 |
One of the main concerns in today's processor design is the issue logic. Instruction-level parallelism is usually favored by an out-of-order issue mechanism where instructions can issue independently of the program order. The out-of-order scheme yields the best performance but at the same time introduces important hardware costs such as an associative look-up, which might be prohibitive for wide issue processors with large instruction windows. This associative search may slow-down the clock-rate and it has an important impact on power consumption. In this work, two new issue schemes that reduce the hardware complexity of the issue logic with minimal impact on the average number of instructions executed per cycle are presented.