Load latency tolerance in dynamically scheduled processors
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Evaluation of Design Options for the Trace Cache Fetch Mechanism
IEEE Transactions on Computers - Special issue on cache memory and related problems
Wide and efficient trace prediction using the local trace predictor
Proceedings of the 20th annual international conference on Supercomputing
Tree traversal scheduling: a global instruction scheduling technique for VLIW/EPIC processors
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
Hi-index | 0.00 |
Rapid developments in the exploitation of instruction-level parallelism are prompting deeper-pipelined, wider machines with high issue rates. Speculative execution has been used to provide the required issue bandwidth. Current methods predict a single branch at a time. Performance improvement is possible by predicting multiple branches in a single cycle. The paper presents a technique to predict paths in a single access. The correlation of a path with the branches executed before it, is exploited to provide high prediction accuracy. A novel path prediction automaton is presented The automaton is easily scalable to predict long paths through arbitrary subgraphs. It also predicts a path through a subgraph in a single access. The automaton requires only n+1 bits for predicting the 2/sup n/ paths in a subgraph of depth n. The performance of the proposed path predictor is measured. The full path accuracy (accuracy in predicting all the branches in a path) is higher than or equal to other predictors found in the literature. This performance is achieved at a low hardware cost. The scalability single access prediction and low hardware cost of the path prediction technique presented in the paper make it suitable for machines requiring high issue bandwidth.