Hardware support for large atomic units in dynamically scheduled machines
MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture
Increasing the instruction fetch rate via multiple branch prediction and a branch address cache
ICS '93 Proceedings of the 7th international conference on Supercomputing
Control flow prediction with tree-like subgraphs for superscalar processors
Proceedings of the 28th annual international symposium on Microarchitecture
Trace cache: a low latency approach to high bandwidth instruction fetching
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Alternative fetch and issue policies for the trace cache fetch mechanism
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Trace cache design for wide-issue superscalar processors
Trace caches help dynamic branch prediction make multiple predictions per cycle by embedding some of the predictions in the trace itself. In this work, we evaluate a trace cache capable of delivering a trace consisting of a variable number of instructions via a linked-list mechanism. We evaluate several schemes in the context of an x86 processor model that stores decoded instructions. By developing a new classification for trace cache accesses, we are able to target the misses that cause the largest performance loss. We propose a hardware speculation technique, called NonHead Miss Speculation, which removes much of the penalty associated with nonhead misses in the eight applications we studied. Performance improvements ranged from 2% to 20%, with an average speedup of around 10% across our application suite.