Analysis of hardware prefetching across virtual page boundaries

Authors:
Ronald G. Dreslinski;Ali G. Saidi;Trevor Mudge;Steven K. Reinhardt
Affiliations:
Advanced Computer Architecture Lab, Ann Arbor, MI;Advanced Computer Architecture Lab, Ann Arbor, MI;Advanced Computer Architecture Lab, Ann Arbor, MI;Advanced Computer Architecture Lab, Ann Arbor, MI
Venue:
Proceedings of the 4th international conference on Computing frontiers
Year:
2007

Abstract

Data cache prefetching in the L2 is at the forefront of prefetching research. In this paper we analyze the impact of virtual page boundaries on these prefetchers. Conservative measurements on real hardware show that 30-50% of consecutive virtual pages are mapped to pages that are not consecutive in physical memory. Advanced hardware prefetching techniques that detect access patterns spanning virtual page boundaries therefore often prefetch data from the wrong physical page. Meanwhile, current simulation techniques for evaluating prefetching algorithms assume that all virtual pages are mapped consecutively. We show that not accounting for virtual page boundaries in simulation can lead to performance overestimates of as much as 29% (9% on average). We also show that a simple prefetch filter can improve performance by up to 32% (7% on average) and recover the overestimated performance. We conclude that although previous simulations may not have accounted for virtual page boundaries, the results they demonstrate are still attainable, and that it is not necessary to simulate virtual page boundaries to get accurate results. However, hardware designers should take care to implement a simple filter, or else their hardware may not show the same performance gains that simulation predicted.
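
The abstract does not describe how the prefetch filter is built. As a rough illustration only, one plausible reading is a filter that drops any prefetch candidate falling outside the page of the access that triggered it. The sketch below assumes 4 KiB pages and a simple stride prefetcher; the names `PAGE_SIZE`, `stride_candidates`, and `filter_prefetches` are hypothetical and do not come from the paper.

```python
# Illustrative sketch only: the paper's actual filter design is not given
# in the abstract. Assumes 4 KiB pages and byte addresses as plain ints.
PAGE_SIZE = 4096  # assumed page size, not stated in the abstract


def page_number(addr: int, page_size: int = PAGE_SIZE) -> int:
    """Return the index of the page that contains addr."""
    return addr // page_size


def stride_candidates(trigger_addr: int, stride: int, degree: int) -> list[int]:
    """Addresses a simple stride prefetcher would request (hypothetical helper)."""
    return [trigger_addr + stride * i for i in range(1, degree + 1)]


def filter_prefetches(trigger_addr: int, candidates: list[int],
                      page_size: int = PAGE_SIZE) -> list[int]:
    """Keep only candidates that lie in the same page as the triggering access.

    Candidates in a different page are dropped, because the next virtual
    page may not map to the next physical page.
    """
    trigger_page = page_number(trigger_addr, page_size)
    return [a for a in candidates if page_number(a, page_size) == trigger_page]


if __name__ == "__main__":
    trigger = 0x1F80  # 128 bytes before the end of a 4 KiB page
    candidates = stride_candidates(trigger, stride=64, degree=4)
    kept = filter_prefetches(trigger, candidates)
    print([hex(a) for a in candidates])
    print([hex(a) for a in kept])
```

In this example the prefetcher proposes four cache-line-sized strides, but only the first stays inside the triggering page. The filter discards the other three instead of issuing them to a physical page that may hold unrelated data.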