SIMP (Single Instruction stream/Multiple instruction Pipelining): a novel high-speed single-processor architecture

Authors:
K. Murakami;N. Irie;S. Tomita
Affiliations:
Department of Information Systems, Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, Fukuoka, 816 JAPAN;Department of Information Systems, Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, Fukuoka, 816 JAPAN;Department of Information Systems, Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, Fukuoka, 816 JAPAN
Venue:
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Year:
1989

Abstract

SIMP is a novel multiple instruction-pipeline parallel architecture. It aims to drastically enhance the performance of SISD processors by exploiting both temporal and spatial parallelism while preserving program compatibility. The degree of performance enhancement achieved by SIMP depends on: i) how multiple instructions are supplied continuously, and ii) how data and control dependencies are resolved effectively. We have devised techniques for instruction fetch and dependency resolution. The instruction fetch mechanism employs: i) prefetching multiple instructions with the help of branch prediction, ii) squashing instructions selectively, and iii) providing multiple conditional modes as a result. The dependency resolution mechanism permits out-of-order execution of a sequential instruction stream. Our out-of-order execution model is based on Tomasulo's algorithm, which has been used in single instruction-pipeline processors. However, it is greatly extended and adapted to multiple instruction pipelining by: i) detecting and identifying multiple dependencies simultaneously, ii) alleviating the effects of control dependencies with both eager execution and advance execution, and iii) ensuring a precise machine state against branches and interrupts. With these techniques, SIMP is one of the most promising architectures for the coming generation of high-speed single processors.
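
The idea of identifying several dependencies at once when a group of instructions enters the pipelines together can be illustrated with a small Tomasulo-style tagging sketch. This is not the SIMP mechanism itself, whose details the abstract does not give. The names `Instr` and `rename_group`, the register names, and the integer tags are all hypothetical, and the loop resolves sequentially what real hardware would resolve with parallel comparators.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    dest: Optional[str]       # destination register, or None if there is none
    srcs: Tuple[str, ...]     # source registers


def rename_group(group: List[Instr], reg_tags: Dict[str, int], next_tag: int):
    """Tag one group of instructions issued in the same cycle (illustrative only).

    reg_tags maps a register to the tag of its latest pending producer.
    Each result entry is (dest_tag, operands), where an operand is
    ("tag", n) if it must wait for producer n, or ("reg", name) if the
    value can be read from the register file.
    """
    renamed = []
    for ins in group:
        # Look up sources before renaming the destination, so an
        # instruction such as r1 = r1 + r2 waits on the previous producer of r1.
        operands = [
            ("tag", reg_tags[s]) if s in reg_tags else ("reg", s)
            for s in ins.srcs
        ]
        dest_tag = None
        if ins.dest is not None:
            dest_tag = next_tag
            reg_tags[ins.dest] = dest_tag   # later readers wait on this tag
            next_tag += 1
        renamed.append((dest_tag, operands))
    return renamed, next_tag


# Hypothetical four-instruction group with dependencies inside the group.
group = [
    Instr("r1", ("r2", "r3")),   # I0: r1 = f(r2, r3)
    Instr("r4", ("r1", "r5")),   # I1: reads r1 produced by I0
    Instr("r1", ("r6",)),        # I2: writes r1 again, gets a fresh tag
    Instr("r7", ("r1", "r4")),   # I3: reads r1 from I2 and r4 from I1
]
renamed, next_tag = rename_group(group, {}, 0)
for dest_tag, operands in renamed:
    print(dest_tag, operands)
```

Because each destination receives a fresh tag, the second write to `r1` does not conflict with the first, and only true read-after-write dependencies remain for the instructions in the group.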