Instruction-level parallelism from execution interlock collapsing

Authors:
Nadeem Malik;Richard J. Eickemeyer;Stamatis Vassiliadis
Venue:
ACM SIGARCH Computer Architecture News
Year:
1992

Citing 6

Reducing the cost of branches. ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Implementing Precise Interrupts in Pipelined Processors. IEEE Transactions on Computers
Limits of instruction-level parallelism. ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Single instruction stream parallelism is greater than two. ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Interlock Collapsing ALU's. IEEE Transactions on Computers
High-Performance 3-1 Interlock Collapsing ALU's. IEEE Transactions on Computers

Cited 4

Interlock collapsing ALU for increased instruction-level parallelism. MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Interlock Collapsing ALU's. IEEE Transactions on Computers
High-Performance 3-1 Interlock Collapsing ALU's. IEEE Transactions on Computers
Proof of correctness of high-performance 3-1 interlock collapsing ALUs. IBM Journal of Research and Development

Abstract

An innovative technique has been developed that collapses execution interlocks between integer ALU operations, as well as between address generation operations, allowing two instructions with a true dependency to execute in parallel in a single cycle. Because the proposed scheme has been shown not to increase the machine cycle time, it is a potentially attractive means of increasing instruction-level parallelism. Preliminary results show that, within basic blocks, the geometric mean of the speedup from this design technique is up to 10% on the integer SPEC benchmarks; with the floating-point benchmarks included, it is up to 6%. The results also suggest that, depending on the application environment, this design may be used as an alternative to the relatively more expensive out-of-order instruction issue approach.
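
To make the idea in the abstract concrete, the following is a minimal toy sketch, not the authors' simulator or hardware design. It assumes a hypothetical in-order, two-wide machine that only tries to pair adjacent instructions within one basic block, and it considers only true (read-after-write) dependencies. The names `Instr`, `depends`, and `cycles` are illustrative inventions, and the `collapsible` flag stands in for "integer ALU or address generation operation".

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    dest: str               # register written by the instruction
    srcs: Tuple[str, ...]   # registers read by the instruction
    collapsible: bool       # assumption: True for simple ALU / address-generation ops


def depends(first: Instr, second: Instr) -> bool:
    """True (read-after-write) dependency: second reads what first writes."""
    return first.dest in second.srcs


def cycles(block, collapse: bool) -> int:
    """Count issue cycles for one basic block on the assumed two-wide machine."""
    i = 0
    n = 0
    while i < len(block):
        n += 1  # every loop iteration models one issue cycle
        if i + 1 < len(block):
            a, b = block[i], block[i + 1]
            interlocked = depends(a, b)
            # With collapsing, a dependent pair of collapsible ops may still
            # issue together (modelled here as a single combined operation).
            can_collapse = collapse and a.collapsible and b.collapsible
            if not interlocked or can_collapse:
                i += 2
                continue
        i += 1  # interlock not collapsed: only one instruction issues
    return n


# A chain of four dependent integer operations (hypothetical example block).
block = [
    Instr("r1", ("r2", "r3"), True),  # r1 = r2 + r3
    Instr("r4", ("r1", "r5"), True),  # r4 = r1 + r5, reads r1
    Instr("r6", ("r4", "r7"), True),  # r6 = r4 - r7, reads r4
    Instr("r8", ("r6", "r9"), True),  # r8 = r6 & r9, reads r6
]

base = cycles(block, collapse=False)
collapsed = cycles(block, collapse=True)
print(base, collapsed, base / collapsed)
```

This block is a worst case chosen for illustration: every instruction depends on its predecessor, so the toy model issues it in 4 cycles without collapsing and 2 cycles with it. Real code contains far fewer collapsible pairs, which is consistent with the much smaller speedups (up to 10%) reported in the abstract.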