An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors

Authors:
R. D. Acosta; J. Kjelstrup; H. C. Torng
Venue:
IEEE Transactions on Computers
Year:
1986

Abstract

Processors with multiple functional units, such as the CRAY-1, Cyber 205, and FPS 164, have been used for high-end scientific computation. Much effort has gone into increasing the throughput of such systems. One critical consideration in their design is the identification and implementation of a suitable instruction issuing scheme. Existing approaches do not issue enough instructions per machine cycle to fully utilize the functional units and reach the high performance level achievable with these powerful execution resources.