Multiple instruction issue in the NonStop Cyclone processor

Authors:
Robert W. Horst; Richard L. Harris; Robert L. Jardine
Affiliations:
Tandem Computers Incorporated, 19333 Vallco Parkway, Cupertino, CA (all authors)
Venue:
ISCA '90: Proceedings of the 17th Annual International Symposium on Computer Architecture
Year:
1990

Abstract

This paper describes the architecture for issuing multiple instructions per clock in the NonStop Cyclone Processor. Pairs of instructions are fetched and decoded by a dual two-stage prefetch pipeline and passed to a dual six-stage pipeline for execution. Dynamic branch prediction is used to reduce branch penalties. A unique microcode routine for each pair is stored in the large duplexed control store. The microcode controls parallel data paths optimized for executing the most frequent instruction pairs. Other features of the architecture include cache support for unaligned double-precision accesses, a virtually addressed main memory, and a novel precise exception mechanism.