Techniques for Efficient Processing in Runahead Execution Engines
Proceedings of the 32nd annual international symposium on Computer Architecture
Previous research on runahead execution treated it purely as a prefetching technique. Although the results of instructions independent of an L2 miss are correctly computed during runahead mode, previous approaches discarded those results instead of reusing them in normal-mode execution. This paper evaluates the performance impact of reusing the results of pre-executed instructions. We find that, even with an ideal scheme, reusing these results is not worthwhile. Our analysis provides insight into why result reuse does not yield significant performance improvement in runahead processors, and concludes that runahead execution should be employed as a prefetching mechanism rather than a combined prefetching/result-reuse mechanism.
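The distinction the abstract draws can be illustrated with a toy model (this is only a sketch, not the paper's simulator; the register names, the `INV` sentinel, and the instruction format are illustrative assumptions). During runahead mode, the value of the L2-missing load is unknown, so any instruction that transitively depends on it produces an invalid result; all other instructions compute correct results, which an idealized reuse scheme could in principle carry back into normal mode:

```python
import operator

# Toy sketch of runahead pre-execution with miss-dependence tracking.
# INV marks a value that (transitively) depends on the L2-missing load,
# mirroring the invalid (INV) bits used in runahead processors.
INV = object()

def runahead(regs, program, miss_dest):
    """Pre-execute `program` after a load into `miss_dest` misses in L2.

    `program` is a list of (dest_reg, op, src_regs) tuples.
    Returns {reg: value} for miss-independent results -- the ones that
    are computed correctly during runahead and could be reused.
    """
    regs = dict(regs)
    regs[miss_dest] = INV              # the missing load's value is unknown
    reusable = {}
    for dest, op, srcs in program:
        vals = [regs[s] for s in srcs]
        if INV in vals:
            regs[dest] = INV           # poisoned: depends on the miss
        else:
            regs[dest] = op(*vals)     # correctly computed in runahead mode
            reusable[dest] = regs[dest]
    return reusable

# Hypothetical instruction stream following a load into r0 that misses:
prog = [
    ("r2", operator.add, ("r1", "r1")),  # independent of the miss
    ("r3", operator.add, ("r0", "r1")),  # reads r0 -> invalid
    ("r4", operator.mul, ("r2", "r2")),  # independent dependence chain
]
print(runahead({"r0": 0, "r1": 3}, prog, "r0"))
# -> {'r2': 6, 'r4': 36}  (r3 is miss-dependent and not reusable)
```

A prefetch-only runahead engine discards even the valid entries (`r2`, `r4`) at the end of runahead mode and re-executes everything; the paper's finding is that salvaging them buys little performance in practice.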