Multi-stage Cascaded Prediction
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Introducing entropies for representing program behavior and branch predictor performance
Proceedings of the 2007 workshop on Experimental computer science
Introducing entropies for representing program behaviors and branch predictor performances
ecs'07 Experimental computer science on Experimental computer science
Potentials of branch predictors: from entropy viewpoints
ARCS'08 Proceedings of the 21st international conference on Architecture of computing systems
Hi-index | 0.00 |
Two-level predictors deliver highly accurate conditional branch prediction, indirect branch target prediction and value prediction. Accurate prediction enables speculative execution of instructions, a technique that increases instruction level parallelism. Unfortunately, the accuracy of a two-level predictor is limited by the cost of the predictor table that stores associations between history patterns and target predictions. Two-stage cascaded prediction, a recently proposed hybrid prediction architecture, uses pattern filtering to reduce the cost of this table while preserving prediction accuracy. In this study we generalize two-stage prediction to multi-stage prediction. We first determine the limit of accuracy on an indirect branch trace using a multi-stage predictor with an unlimited hardware budget. We then investigate practical cascaded predictors with limited tables and a small number of stages. Compared to two-level prediction, multi-stage cascaded prediction delivers superior prediction accuracy for any given total table entry budget we considered. In particular, a 512-entry three-stage cascaded predictor reaches 92% accuracy, reducing table size by a factor of four compared to a two-level predictor. At 1.5K entries, a three-stage predictor reaches 94% accuracy, the hit rate of a hypothetical two-level predictor with an unlimited, fully associative predictor table. At 6K entries, accuracy increases to 95%, the limit achieved by an idealized twelve-stage cascaded predictor with an unlimited hardware budget. These results indicate that highly accurate indirect branch target prediction is now well within the capability of current hardware technology.