Prefetching Using Markov Predictors

Authors: Doug Joseph; Dirk Grunwald
Affiliations: IBM T.J. Watson Research Center, Yorktown Heights, NY; Univ. of Colorado, Boulder
Venue: IEEE Transactions on Computers - Special issue on cache memory and related problems
Year: 1999


Abstract

Prefetching is one approach to reducing the latency of memory operations in modern computer systems. In this paper, we describe the Markov prefetcher. This prefetcher acts as an interface between the on-chip and off-chip caches and can be added to existing computer designs. The Markov prefetcher is distinguished by prefetching multiple reference predictions from the memory subsystem and then prioritizing the delivery of those references to the processor. This design results in a prefetching system that provides good coverage, is accurate, and produces timely results that can be effectively used by the processor. We also explore a range of techniques that can be used to reduce the bandwidth demands of prefetching, leading to improved memory system performance. In our cycle-level simulations, the Markov prefetcher reduces the overall execution stalls due to instruction and data memory operations by an average of 54 percent for various commercial benchmarks while using only two-thirds of the memory of a demand-fetch cache organization.
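
The idea of issuing several prioritized predictions per miss can be illustrated with a small software model. The sketch below is not the design from the paper; it is a minimal first-order Markov table over miss addresses, written under assumptions the abstract does not state: the class name `MarkovPrefetcher`, the parameters `max_entries` and `max_successors`, the most-recent-first ordering of successors, and the least-recently-updated eviction of table entries are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class MarkovPrefetcher:
    """Toy first-order Markov predictor over a stream of miss addresses.

    Assumption: successors are ranked most-recent-first and table entries
    are evicted in least-recently-updated order. The abstract specifies
    neither policy.
    """

    def __init__(self, max_entries=1024, max_successors=4):
        self.max_entries = max_entries
        self.max_successors = max_successors
        # miss address -> list of observed successor misses, highest priority first
        self.table = OrderedDict()
        self.last_miss = None

    def record_miss(self, addr):
        """Learn the transition last_miss -> addr, then return the
        prioritized prefetch candidates recorded for addr."""
        if self.last_miss is not None:
            successors = self.table.setdefault(self.last_miss, [])
            self.table.move_to_end(self.last_miss)  # mark entry as recently updated
            if addr in successors:
                successors.remove(addr)
            successors.insert(0, addr)               # newest transition gets top priority
            del successors[self.max_successors:]     # keep only the top-ranked successors
            if len(self.table) > self.max_entries:
                self.table.popitem(last=False)       # evict the stalest entry
        self.last_miss = addr
        return list(self.table.get(addr, []))


if __name__ == "__main__":
    prefetcher = MarkovPrefetcher()
    for miss in [0x100, 0x200, 0x100, 0x300, 0x100]:
        candidates = prefetcher.record_miss(miss)
        print(hex(miss), [hex(c) for c in candidates])
```

Each call returns the candidate list in priority order, so a consumer with limited bandwidth could issue only the first one or two entries. Capping the list length is one simple way to bound the extra memory traffic that multiple predictions generate.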