DSTRIDE: data-cache miss-address-based stride prefetching scheme for multimedia processors

Authors:
Hariprakash. G; Achutharaman. R; Amos R. Omondi
Affiliations:
Sun Microsystems, Singapore; Sun Microsystems, Singapore; N4 Nanyang Avenue, Nanyang Technological University, Singapore
Venue:
ACSAC '01 Proceedings of the 6th Australasian conference on Computer systems architecture
Year:
2001

Abstract

Prefetching reduces cache miss latency by moving data up the memory hierarchy before it is actually needed. Recent hardware-based stride prefetching techniques mostly rely on processor pipeline information (e.g., the program counter and branch prediction table) for prediction. Continuing developments in processor microarchitecture drastically change core pipeline design and require that existing hardware-based stride prefetching techniques be adapted to each new processor architecture.

In this paper we present a new hardware-based stride prefetching technique, called DStride, that is independent of processor pipeline design changes. In this design, the first-level data cache miss address stream is used for stride prediction. The miss addresses are separated into a load stream and a store stream to increase the efficiency of the predictor. They are checked separately against the recent miss address stream to detect strides. The detected steady strides are maintained in a table that also performs look-ahead stride prefetching when the processor stride reference rate is higher than the prefetch request service rate.

We evaluated our design with multimedia workloads using execution-driven simulation with the SimpleScalar toolset. Our experiments show that DStride is very effective in reducing overall pipeline stalls due to cache miss latency, especially for stride-intensive applications such as multimedia workloads.
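
The detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' hardware design: the class names, the history length, the table size, the fixed look-ahead depth, and the rule that three misses in arithmetic progression make a "steady" stride are all assumptions chosen for illustration. The sketch also assumes each stream keeps its own recent-miss history, and it does not model the cache itself, so prefetched lines are not filtered out of the miss stream.

```python
from collections import deque


class StreamPredictor:
    """Stride detector for one miss stream (loads or stores). Hypothetical sketch."""

    def __init__(self, history_size=8, lookahead=2, table_size=16):
        self.history = deque(maxlen=history_size)  # recent miss addresses
        self.table = {}  # next expected miss address -> stride
        self.lookahead = lookahead
        self.table_size = table_size

    def on_miss(self, addr):
        """Record a miss and return the addresses to prefetch (possibly empty)."""
        # A miss on a predicted address confirms an already tracked stride.
        stride = self.table.pop(addr, None)
        if stride is None:
            # Otherwise look for two earlier misses such that
            # addr - 2*s, addr - s, addr form an arithmetic progression.
            for prev in self.history:
                s = addr - prev
                if s != 0 and (prev - s) in self.history:
                    stride = s
                    break
        prefetches = []
        if stride is not None:
            # Run ahead of the reference stream by `lookahead` strides.
            prefetches = [addr + stride * k for k in range(1, self.lookahead + 1)]
            self.table[addr + stride] = stride
            if len(self.table) > self.table_size:
                del self.table[next(iter(self.table))]  # evict the oldest entry
        self.history.append(addr)
        return prefetches


class DStrideSketch:
    """Keeps load and store miss streams apart, as the abstract describes."""

    def __init__(self, history_size=8, lookahead=2, table_size=16):
        self.streams = {
            kind: StreamPredictor(history_size, lookahead, table_size)
            for kind in ("load", "store")
        }

    def on_miss(self, kind, addr):
        """`kind` is "load" or "store"; `addr` is the L1 data-cache miss address."""
        return self.streams[kind].on_miss(addr)
```

With the default look-ahead of 2, load misses at 0x1000, 0x1040 and 0x1080 produce no prefetches for the first two misses and the candidates 0x10C0 and 0x1100 on the third. Interleaved store misses do not disturb this, because they go through a separate predictor. Note that nothing in the sketch reads the program counter or any other pipeline state, which is the property the abstract emphasizes.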