Reducing energy of virtual cache synonym lookup using bloom filters
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Streamlining long latency instructions for seamlessly combined out-of-order and in-order execution
Microprocessors & Microsystems
Way guard: a segmented counting bloom filter approach to reducing energy for set-associative caches
Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design
Efficient system-on-chip energy management with a segmented bloom filter
ARCS'06 Proceedings of the 19th international conference on Architecture of Computing Systems
Hi-index | 0.00 |
As the performance gap between processors and memory systems increases, the CPU spends more time stalled waiting for data from main memory. Critical long latency instructions, such as loads that miss to main memory and floating point arithmetic operations, are primarily responsible for these stalls. We present a technique, Fetch Halting, that suspends instruction fetching when the processor is stalled by a critical long latency instruction. This enables us to save power in one of the primary sources of power dissipation, the issue logic. By reducing the occupancy rates in the issue queue and reorder buffer, we save power by disabling a large number of unused queue entries. In order to characterize critical instructions, our approach combines software profiling and hardware monitoring techniques. Statistical profiling information obtained from sample runs is used to identify critical instructions while hardware cache-miss prediction is used to monitor these instructions. We show that, on average, Fetch Halting can reduce issue queue and reorder buffer occupancy rates by 17.2% and 23.4% respectively, with an average performance loss of only 4.6%.