Wrong-path instruction prefetching

Authors:
Jim Pierce; Trevor Mudge
Affiliations:
Intel Corporation; University of Michigan
Venue:
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Year:
1996



Abstract

Instruction cache misses can severely limit the performance of both superscalar processors and high-speed sequential machines. Instruction prefetch algorithms attempt to reduce this performance degradation by bringing lines into the instruction cache before the CPU fetch unit needs them. Several such algorithms have been proposed, most notably next-line prefetching and target prefetching. We propose a new scheme called wrong-path prefetching, which combines next-line prefetching with the prefetching of all control instruction targets, regardless of the predicted direction of conditional branches. The algorithm substantially reduces the cycles lost to instruction cache misses while somewhat increasing memory traffic. Wrong-path prefetching performs better than the other prefetch algorithms studied in all of the cache configurations examined, while requiring little additional hardware. For example, the best wrong-path prefetch algorithm can yield a speedup of 16% when using an 8K instruction cache. In fact, an 8K wrong-path prefetched instruction cache is shown to achieve the same miss rate as a 32K non-prefetched cache. Finally, wrong-path prefetching is shown to be applicable to both multi-issue machines and machines with long L1 miss latencies.
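
The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' hardware design: the function names, the 32-byte line size, and the list-of-line-indices interface are assumptions made purely for illustration. The sketch only shows which cache lines would be prefetch candidates when next-line prefetching is combined with target prefetching that ignores the branch prediction.

```python
from typing import List, Optional

LINE_SIZE = 32  # bytes per cache line; an assumed value, not taken from the paper


def line_of(addr: int, line_size: int = LINE_SIZE) -> int:
    """Return the index of the cache line that contains byte address addr."""
    return addr // line_size


def wrong_path_prefetch_lines(
    fetch_addr: int,
    branch_target: Optional[int] = None,
    line_size: int = LINE_SIZE,
) -> List[int]:
    """Hypothetical helper: list the cache lines to prefetch for one fetch.

    The next sequential line is always a candidate (next-line prefetching).
    If the fetched instruction is a control instruction with a known target,
    the target's line is also a candidate. The function deliberately takes
    no prediction argument: the target is prefetched whether the branch is
    predicted taken or not taken.
    """
    current = line_of(fetch_addr, line_size)
    candidates = [current + 1]  # next-line prefetch
    if branch_target is not None:
        target_line = line_of(branch_target, line_size)
        # Skip targets in the line being fetched or already queued.
        if target_line != current and target_line not in candidates:
            candidates.append(target_line)
    return candidates


if __name__ == "__main__":
    # A branch at 0x100 (line 8) whose target 0x400 lies in line 32.
    print(wrong_path_prefetch_lines(0x100, branch_target=0x400))
```

With these assumptions, a branch whose target falls in a distant line produces two candidates (the fall-through line and the target line), so whichever path is eventually taken, its line has already been requested. The cost is the extra memory traffic that the abstract mentions.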