Interlock collapsing ALU for increased instruction-level parallelism
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
A technique has been developed that collapses execution interlocks between integer ALU operations, as well as between address generation operations, allowing two instructions with a true dependency to execute in parallel in a single cycle. Given that the proposed scheme has been shown not to increase the machine cycle time, it potentially provides an attractive means of increasing instruction-level parallelism. Preliminary results show that, within basic blocks, the geometric mean of the speedup from this design technique is up to 10% on the integer SPEC benchmarks. With the floating-point benchmarks included, the geometric mean of the speedup is up to 6%. The results also suggest that, depending on the application environment, this design may be used as an alternative to the relatively more expensive out-of-order instruction issue approach.
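To make the idea of interlock collapsing concrete, the following is a minimal functional sketch, not the hardware design described in the paper. It assumes a toy register file, a hypothetical instruction tuple `(dest, op, src1, src2)`, and a simplistic cycle count in which a conventional ALU retires one operation per cycle while a collapsing unit retires a dependent pair in one. The names `alu2`, `run_sequential`, and `run_collapsed` are illustrative only.

```python
import operator

# Toy operation set; the real unit's supported operations are not specified here.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}
MASK = 0xFFFFFFFF  # assumed 32-bit datapath


def alu2(op, a, b):
    """Conventional two-input ALU operation, truncated to 32 bits."""
    return OPS[op](a, b) & MASK


def run_sequential(pair, regs):
    """Execute two instructions one after the other (one cycle each)."""
    regs = dict(regs)
    cycles = 0
    for dest, op, s1, s2 in pair:
        regs[dest] = alu2(op, regs[s1], regs[s2])
        cycles += 1
    return regs, cycles


def run_collapsed(pair, regs):
    """Execute a dependent pair as one operation on the ORIGINAL register values.

    The second result is computed from the first instruction's inputs rather
    than by waiting for its destination register to be written, which is the
    functional effect of collapsing the interlock (assumed cost: one cycle).
    """
    old = dict(regs)
    (d1, op1, a, b), (d2, op2, s1, s2) = pair
    first = alu2(op1, old[a], old[b])
    # Wherever the second instruction reads d1, substitute the first result.
    x = first if s1 == d1 else old[s1]
    y = first if s2 == d1 else old[s2]
    new = dict(old)
    new[d1] = first
    new[d2] = alu2(op2, x, y)
    return new, 1


if __name__ == "__main__":
    regs = {"r1": 0, "r2": 7, "r3": 5, "r4": 0, "r5": 2}
    # r1 = r2 + r3 ; r4 = r1 - r5  (r4 truly depends on r1)
    pair = [("r1", "add", "r2", "r3"), ("r4", "sub", "r1", "r5")]
    seq_regs, seq_cycles = run_sequential(pair, regs)
    col_regs, col_cycles = run_collapsed(pair, regs)
    print(seq_regs == col_regs, seq_cycles, col_cycles)
```

Both paths produce the same architectural state; only the modeled cycle count differs. Whether a real implementation achieves this without lengthening the cycle depends on the circuit design, which this sketch does not model.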