Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers

Authors: Gurindar S. Sohi
Affiliations: Univ. of Wisconsin, Madison
Venue: IEEE Transactions on Computers
Year: 1990

Abstract

The problems of data dependency resolution and precise interrupt implementation in pipelined processors are addressed together. A design is presented for a hardware mechanism that resolves dependencies dynamically and, at the same time, guarantees precise interrupts. Simulation studies show that, by resolving dependencies, the proposed mechanism obtains a significant speedup over a simple instruction issue mechanism while also implementing precise interrupts.
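
The abstract does not spell out the hardware design, so the following is only a generic toy sketch of the idea it names, not the paper's mechanism: instructions may finish out of order once their operands are ready, but results update the register file strictly in program order, so a fault leaves a precise state. The `Entry` fields, the `faults` flag, the unbounded window, and the made-up operation (sum of sources plus one) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    dest: str                         # destination register name
    srcs: Tuple[str, ...]             # source register names
    latency: int                      # execution cycles once operands are ready
    faults: bool = False              # hypothetical flag: raises an exception
    remaining: Optional[int] = None   # None until execution starts
    value: Optional[int] = None       # result, held in the window until commit
    done: bool = False


def lookup(window: List[Entry], pos: int, reg: str,
           regs: Dict[str, int]) -> Optional[int]:
    """Return an operand value, or None if its producer has not finished."""
    for older in reversed(window[:pos]):
        if older.dest == reg:         # youngest older in-flight writer of reg
            return older.value if older.done else None
    return regs[reg]                  # no in-flight producer: committed state


def run(program: List[Entry], regs: Dict[str, int]):
    """Simulate until the window drains or a faulting entry reaches the head.

    Returns (committed_count, completion_order, faulted).
    """
    window = list(program)            # toy model: everything issued up front
    index = {id(e): i for i, e in enumerate(program)}
    committed, order = 0, []
    while window:
        # Execute: any entry with ready operands may start, regardless of age.
        for pos, e in enumerate(window):
            if e.done:
                continue
            if e.remaining is None:
                ops = [lookup(window, pos, r, regs) for r in e.srcs]
                if any(v is None for v in ops):
                    continue          # a producer is still executing: wait
                e.value, e.remaining = sum(ops) + 1, e.latency
            e.remaining -= 1
            if e.remaining <= 0:
                e.done = True
                order.append(index[id(e)])
        # Commit: only from the head, so registers change in program order.
        while window and window[0].done:
            head = window.pop(0)
            if head.faults:
                window.clear()        # discard results of younger instructions
                return committed, order, True
            regs[head.dest] = head.value
            committed += 1
    return committed, order, False


if __name__ == "__main__":
    prog = [Entry("r1", ("r0",), 3), Entry("r2", ("r0",), 1),
            Entry("r3", ("r0",), 1, faults=True), Entry("r4", ("r0",), 1)]
    regs = {"r0": 10, "r1": 0, "r2": 0, "r3": 0, "r4": 0}
    print(run(prog, regs), regs)
```

In the usage example, the fourth instruction finishes executing before the slow first one, yet `r4` is never written: the fault on the third instruction is recognized at commit, after the two older instructions have updated `r1` and `r2` and before anything younger has touched the register file.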