Reducing Branch Delay to Zero in Pipelined Processors
IEEE Transactions on Computers
A mechanism to reduce the cost of branches in pipelined processors is presented. The technique is implemented with a non-conventional cache (a branch target cache) and an early branch detection circuit. The instruction fetch unit (IFU) executes branches in parallel with the other instructions, so the execution-time cost of many branches can be effectively reduced to zero. An analytical model of the mechanism is used to obtain the IFU design parameters, and simulation results show the effectiveness of the technique.
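The idea of a branch whose cost drops to zero can be illustrated with a toy cycle count. The sketch below is not the paper's design: it assumes unconditional branches only, an unbounded branch target cache modeled as a dictionary, and a hypothetical penalty of 2 extra cycles for a branch that the IFU cannot resolve early. The names `count_cycles`, `LOOP`, and `branch_penalty` are illustrative.

```python
# Toy model (assumption, not the paper's IFU): each instruction is a tuple
# ("alu",) or ("jmp", target_address). Only unconditional branches are modeled.
LOOP = [("alu",), ("alu",), ("alu",), ("jmp", 0)]  # 3 ALU ops, then jump back


def count_cycles(program, n_instructions, use_btc, branch_penalty=2):
    """Count cycles needed to process n_instructions, starting at address 0.

    branch_penalty is a hypothetical number of extra cycles lost when a
    branch is not resolved by the fetch unit.
    """
    btc = {}      # branch target cache: branch address -> target address
    pc = 0
    cycles = 0
    processed = 0
    while processed < n_instructions:
        instr = program[pc]
        processed += 1
        if instr[0] == "jmp":
            if use_btc and pc in btc:
                # Hit: the fetch unit redirects itself, so the branch
                # consumes no execution cycles in this model.
                pc = btc[pc]
            else:
                # Miss (or no cache): pay one cycle plus the penalty,
                # and remember the target for next time if caching is on.
                cycles += 1 + branch_penalty
                if use_btc:
                    btc[pc] = instr[1]
                pc = instr[1]
        else:
            cycles += 1
            pc += 1
    return cycles


if __name__ == "__main__":
    print("without cache:", count_cycles(LOOP, 40, use_btc=False))
    print("with cache:   ", count_cycles(LOOP, 40, use_btc=True))
```

In this model, ten iterations of the loop pay the branch cost on every iteration without the cache, but only on the first iteration with it. Conditional branches, finite cache capacity, and the early detection circuit are deliberately left out.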