High-Performance 3-1 Interlock Collapsing ALU's

Authors:
J. Phillips; S. Vassiliadis
Venue:
IEEE Transactions on Computers
Year:
1994

Citing 10

An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors
IEEE Transactions on Computers

The WM computer architecture
ACM SIGARCH Computer Architecture News

Available instruction-level parallelism for superscalar and superpipelined machines
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems

The Nonuniform Distribution of Instruction-Level and Machine Parallelism and its Effect on Performance
IEEE Transactions on Computers

IBM RISC System/6000 processor architecture
IBM Journal of Research and Development

Design of the IBM RISC System/6000 floating-point execution unit
IBM Journal of Research and Development

Instruction-level parallelism from execution interlock collapsing
ACM SIGARCH Computer Architecture News

The Architecture of Symbolic Computers
The Architecture of Symbolic Computers

Introduction to Arithmetic for Digital Systems Designers
Introduction to Arithmetic for Digital Systems Designers

Interlock Collapsing ALU's
IEEE Transactions on Computers

Cited 21

Instruction-level parallelism from execution interlock collapsing
ACM SIGARCH Computer Architecture News

On the attributes of the SCISM organization
ACM SIGARCH Computer Architecture News

Interlock collapsing ALU for increased instruction-level parallelism
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture

SCISM: a scalable compound instruction set machine
IBM Journal of Research and Development

The performance potential of data dependence speculation & collapsing
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture

CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit
Proceedings of the 27th annual international symposium on Computer architecture

A Java-Enabled DSP
Embedded Processor Design Challenges: Systems, Architectures, Modeling, and Simulation - SAMOS

A Java-enabled DSP
Embedded processor design challenges

Characterizing and predicting value degree of use
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture

From Sequences of Dependent Instructions to Functions: An Approach for Improving Performance without ILP or Speculation
Proceedings of the 31st annual international symposium on Computer architecture

Dynamic Strands: Collapsing Speculative Dependence Chains for Reducing Pipeline Communication
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture

Dataflow Mini-Graphs: Amplifying Superscalar Capacity and Bandwidth
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture

Application-Specific Processing on a General-Purpose Core via Transparent Instruction Set Customization
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture

An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors
Proceedings of the 32nd annual international symposium on Computer Architecture

Exploring the design space of LUT-based transparent accelerators
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems

Scalable subgraph mapping for acyclic computation accelerators
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems

Exploiting Narrow Accelerators with Data-Centric Subgraph Mapping
Proceedings of the International Symposium on Code Generation and Optimization

Proof of correctness of high-performance 3-1 interlock collapsing ALUs
IBM Journal of Research and Development

Mighty-morphing power-SIMD
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems

Comparing FPGA vs. custom cmos and the impact on processor microarchitecture
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays

SoftHV: a HW/SW co-designed processor with horizontal and vertical fusion
Proceedings of the 8th ACM International Conference on Computing Frontiers

Quantified Score

Hi-index	14.98

Abstract

A high-performance 3-1 interlock collapsing ALU, i.e., an ALU that allows most execution interlocks to be executed in a single machine cycle, is presented. We focus on reducing the Boolean equations that describe the device and on incorporating new mechanisms into the interlock collapsing ALU design, with particular attention to reducing the critical-path delay of the implementation. It is shown that, assuming a commonly available CMOS technology, the delay of the proposed device, measured in logic stages, equals the number of logic stages required to implement a 3-1 binary adder. The resulting implementation demonstrates that the proposed 3-1 interlock collapsing ALU can be designed to outperform existing interlock collapsing ALU schemes by a factor of at least two. Finally, it is suggested that the proposed device can be used in multiple-instruction-issue machines, allowing interlocked instructions to be issued and executed in parallel, in a single machine cycle, with no increase in cycle time.
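
To make the idea of collapsing an interlock concrete, the sketch below models only the simplest case: two dependent additions (`r1 = a + b` followed by `r2 = r1 + c`) evaluated as a single three-operand addition. It uses the textbook carry-save technique (a 3-2 compression followed by one ordinary addition), which is a common way to build a 3-1 binary adder; it is not the authors' Boolean equations, and it ignores the logical operations a full interlock collapsing ALU would also handle. The 32-bit width and the function names `carry_save`, `add3`, and `sequential` are assumptions made for illustration.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit datapath (illustrative choice)


def carry_save(a, b, c):
    """3-2 compression: reduce three operands to a sum word and a carry word.

    Each bit position is handled independently, so no carry propagates here.
    """
    s = a ^ b ^ c                                # per-bit sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, shifted left
    return s & MASK, carry & MASK


def add3(a, b, c):
    """Collapsed form: one 3-2 compression plus a single carry-propagate add."""
    s, carry = carry_save(a, b, c)
    return (s + carry) & MASK


def sequential(a, b, c):
    """Interlocked form: the second add must wait for the first result."""
    r1 = (a + b) & MASK
    return (r1 + c) & MASK


if __name__ == "__main__":
    print(add3(5, 7, 9))  # prints 21, the same value sequential(5, 7, 9) returns
```

The collapsed form performs only one carry-propagate addition, which is the intuition behind comparing the device's delay with that of a 3-1 binary adder rather than with two back-to-back 2-1 additions.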