When prefetching improves/degrades performance

Authors:
Thomas R. Puzak;A. Hartstein;P. G. Emma;V. Srinivasan
Affiliations:
IBM -- T. J. Watson Research Center, Yorktown Heights, NY;IBM -- T. J. Watson Research Center, Yorktown Heights, NY;IBM -- T. J. Watson Research Center, Yorktown Heights, NY;IBM -- T. J. Watson Research Center, Yorktown Heights, NY
Venue:
Proceedings of the 2nd conference on Computing frontiers
Year:
2005

Abstract

We formulate a new method for evaluating any prefetching algorithm, real or hypothetical. This method lets researchers analyze the potential improvement that prefetching can bring to an application, independent of any known prefetching algorithm. We characterize prefetching with three metrics: timeliness, coverage, and accuracy. We demonstrate the usefulness of this method using a Markov prefetch algorithm. Under ideal conditions, prefetching can remove nearly all of the pipeline stalls associated with a cache miss. However, we show that in today's processors nearly all of the performance benefit derived from prefetching is eroded and, in many cases, prefetching degrades performance. We analyze these trade-offs quantitatively and show that there are linear relationships between overall performance and coverage, accuracy, and bandwidth.
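
To make the metrics concrete, the sketch below replays a miss-address trace through a toy first-order Markov predictor and reports coverage and accuracy. It is an illustration under stated assumptions, not the evaluation method of the paper: the function name `markov_prefetch_metrics`, the one-prefetch-per-miss policy, and the definitions used (coverage as useful prefetches over baseline misses, accuracy as useful prefetches over prefetches issued) are assumed here, following common usage. Timeliness and bandwidth are not modeled; a prefetch counts as useful only if it matches the very next miss.

```python
from collections import Counter, defaultdict


def markov_prefetch_metrics(miss_trace):
    """Replay a miss-address trace through a first-order Markov predictor.

    Assumed definitions (common usage, not taken from the paper):
      coverage = useful prefetches / baseline misses
      accuracy = useful prefetches / prefetches issued
    A prefetch is counted as useful only if it matches the very next miss.
    """
    # For each miss address, count which address missed immediately after it.
    successors = defaultdict(Counter)
    issued = 0        # prefetches issued
    useful = 0        # prefetches that matched the next miss
    predicted = None  # address prefetched after the previous miss
    prev = None       # previous miss address

    for addr in miss_trace:
        # Would the outstanding prefetch have removed this miss?
        if predicted is not None and predicted == addr:
            useful += 1

        # Learn the transition prev -> addr.
        if prev is not None:
            successors[prev][addr] += 1

        # Issue one prefetch for the most frequent successor seen so far.
        if successors[addr]:
            predicted = successors[addr].most_common(1)[0][0]
            issued += 1
        else:
            predicted = None
        prev = addr

    coverage = useful / len(miss_trace) if miss_trace else 0.0
    accuracy = useful / issued if issued else 0.0
    return coverage, accuracy


if __name__ == "__main__":
    # Hypothetical trace: a repeating three-address miss pattern.
    cov, acc = markov_prefetch_metrics([1, 2, 3] * 3)
    print(f"coverage={cov:.3f} accuracy={acc:.3f}")
```

On a repeating pattern the predictor needs one pass to learn the transitions, so coverage stays below accuracy: the misses in the learning pass are never prefetched, while almost every prefetch that is issued turns out to be useful.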