Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
Highly concurrent scalar processing
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Efficient hardware for multiway jumps and pre-fetches
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
HPS, a new microarchitecture: rationale and introduction
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
Critical issues regarding HPS, a high performance microarchitecture
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
Checkpoint repair for high-performance out-of-order execution machines
IEEE Transactions on Computers
Exploiting fine-grained parallelism through a combination of hardware and software techniques
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
The expandable split window paradigm for exploiting fine-grain parallelsim
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Alternative implementations of two-level adaptive branch prediction
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
An efficient resource-constrained global scheduling technique for superscalar and VLIW processors
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Increasing the instruction fetch rate via multiple branch prediction and a branch address cache
ICS '93 Proceedings of the 7th international conference on Supercomputing
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Guarded execution and branch prediction in dynamic ILP processors
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Characterizing the impact of predicated execution on branch prediction
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Facilitating superscalar processing via a combined static/dynamic register renaming scheme
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Enhancing instruction scheduling with a block-structured ISA
International Journal of Parallel Programming
Optimization of instruction fetch mechanisms for high issue rates
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Control flow prediction with tree-like subgraphs for superscalar processors
Proceedings of the 28th annual international symposium on Microarchitecture
Multiple-block ahead branch predictors
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Trace cache: a low latency approach to high bandwidth instruction fetching
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Path-based next trace prediction
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A study of branch prediction strategies
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
2n-way jump microinstruction hardware and an effective instruction binding method
MICRO 13 Proceedings of the 13th annual workshop on Microprogramming
Improving quasi-dynamic schedules through region slip
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Hi-index | 0.00 |
To exploit larger amounts of instruction level parallelism, processors are being built with wider issue widths and larger numbers of functional units. Instruction fetch rate must also be increased in order to effectively exploit the performance potential of such processors. Block-structured ISAs provide an effective means of increasing the instruction fetch rate. We define an optimization, called block enlargement, that can be applied to a block-structured ISA to increase the instruction fetch rate of a processor that implements that ISA. We have constructed a compiler that generates block-structured ISA code, and a simulator that models the execution of that code on a block-structured ISA processor. We show that for the SPECint95 benchmarks, the block-structured ISA improves the performance of an aggressive wide issue, dynamically scheduled processor by 15% while using simpler microarchitectural mechanisms to support wide issue and dynamic scheduling.