Highly concurrent scalar processing
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Efficient hardware for multiway jumps and pre-fetches
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
HPS, a new microarchitecture: rationale and introduction
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
Critical issues regarding HPS, a high performance microarchitecture
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
Exploiting fine-grained parallelism through a combination of hardware and software techniques
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Two-level adaptive training branch prediction
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
The expandable split window paradigm for exploiting fine-grain parallelsim
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
An efficient resource-constrained global scheduling technique for superscalar and VLIW processors
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Increasing the instruction fetch rate via multiple branch prediction and a branch address cache
ICS '93 Proceedings of the 7th international conference on Supercomputing
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Guarded execution and branch prediction in dynamic ILP processors
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Characterizing the impact of predicated execution on branch prediction
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Facilitating superscalar processing via a combined static/dynamic register renaming scheme
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Enhancing instruction scheduling with a block-structured ISA
International Journal of Parallel Programming
Optimization of instruction fetch mechanisms for high issue rates
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Instruction fetching: coping with code bloat
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Control flow prediction with tree-like subgraphs for superscalar processors
Proceedings of the 28th annual international symposium on Microarchitecture
Multiple-block ahead branch predictors
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
2n-way jump microinstruction hardware and an effective instruction binding method
MICRO 13 Proceedings of the 13th annual workshop on Microprogramming
Improving superscalar instruction dispatch and issue by exploiting dynamic code sequences
Proceedings of the 24th annual international symposium on Computer architecture
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A Trace Cache Microarchitecture and Evaluation
IEEE Transactions on Computers - Special issue on cache memory and related problems
Evaluation of Design Options for the Trace Cache Fetch Mechanism
IEEE Transactions on Computers - Special issue on cache memory and related problems
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
A scalable front-end architecture for fast instruction delivery
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Optimizations Enabled by a Decoupled Front-End Architecture
IEEE Transactions on Computers
A design space evaluation of grid processor architectures
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
TRIPS: A polymorphous architecture for exploiting ILP, TLP, and DLP
ACM Transactions on Architecture and Code Optimization (TACO)
The instruction register file micro-architecture
Future Generation Computer Systems - Special issue: Parallel computing technologies
Block-aware instruction set architecture
ACM Transactions on Architecture and Code Optimization (TACO)
Merging Head and Tail Duplication for Convergent Hyperblock Formation
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Distributed Microarchitectural Protocols in the TRIPS Prototype Processor
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
The instruction register file micro-architecture
Future Generation Computer Systems - Special issue: Parallel computing technologies
Hi-index | 0.00 |
To exploit larger amounts of instruction level parallelism, processors are being built with wider issue widths and larger numbers of functional units. Instruction fetch rate must also be increased in order to effectively exploit the performance potential of such processors. Block-structured ISAs provide an effective means of increasing the instruction fetch rate. We define an optimization, called block enlargement, that can be applied to a block-structured ISA to increase the instruction fetch rate of a processor that implements that ISA. We have constructed a compiler that generates block-structured ISA code, and a simulator that models the execution of that code on a block-structured ISA processor. We show that for the SPECint95 benchmarks, the block-structured ISA processor executing enlarged atomic blocks outperforms a conventional ISA processor by 12% while using simpler microarchitectural mechanisms to support wide-issue and dynamic scheduling.