Instruction issue logic for high-performance, interruptable pipelined processors

Authors:
G. S. Sohi; S. Vajapeyam
Affiliations:
Computer Sciences Department, University of Wisconsin-Madison; Computer Sciences Department, University of Wisconsin-Madison
Venue:
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Year:
1987

Abstract

The performance of pipelined processors is severely limited by data dependencies. In order to achieve high performance, a mechanism to alleviate the effects of data dependencies must exist. If a pipelined CPU with multiple functional units is to be used in the presence of a virtual memory hierarchy, a mechanism must also exist for determining the state of the machine precisely. In this paper, we combine the issues of dependency-resolution and preciseness of state. We present a design for instruction issue logic that resolves dependencies dynamically and, at the same time, guarantees a precise state of the machine, without a significant hardware overhead. Detailed simulation studies for the proposed mechanism, using the Lawrence Livermore loops as a benchmark, are presented.