Aggressive prefetching mechanisms improve the performance of some important applications, but they substantially increase bus traffic and pressure on the cache tag arrays, and they can even degrade the performance of applications that are not memory bound. We introduce a feedback mechanism, termed the Prefetcher Assessment Buffer (PAB), which filters out prefetch requests that are unlikely to be useful. With it, applications that cannot benefit from aggressive prefetching do not suffer from its side effects. The PAB is evaluated with different trigger configurations, e.g., "all L1 accesses trigger prefetches" and "only L1 misses trigger prefetches". Compared with the non-selective concurrent use of multiple prefetchers, applying the PAB to prefetching from main memory into the L2 cache can reduce the number of loads from main memory by up to 25% without losing performance. Applying more sophisticated techniques to prefetches between the L2 and L1 caches can increase IPC by 4% while reducing the traffic between the caches 8-fold.
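The core idea of such a feedback filter can be sketched as follows. This is a hypothetical illustration, not the paper's actual design: candidate prefetch addresses are recorded in a small buffer rather than issued immediately, a prefetcher earns credit when a later demand access hits one of its recorded addresses, and only prefetchers whose observed accuracy clears a threshold are allowed to issue real prefetches. The class name, buffer capacity, threshold, and warm-up count are all illustrative assumptions.

```python
from collections import OrderedDict

class PrefetcherAssessmentBuffer:
    """Hypothetical sketch of a PAB-style prefetch filter (details assumed)."""

    def __init__(self, capacity=64, threshold=0.5, warmup=32):
        self.buf = OrderedDict()    # candidate addr -> id of proposing prefetcher
        self.capacity = capacity    # assessment buffer size (illustrative)
        self.threshold = threshold  # min useful fraction to keep issuing
        self.warmup = warmup        # assessments before gating kicks in
        self.stats = {}             # prefetcher id -> [useful, total]

    def propose(self, pf_id, addr):
        """A prefetcher proposes addr; return True if it may issue a real prefetch."""
        useful, total = self.stats.setdefault(pf_id, [0, 0])
        self.stats[pf_id][1] += 1
        if addr not in self.buf:
            if len(self.buf) >= self.capacity:
                self.buf.popitem(last=False)  # evict the oldest candidate
            self.buf[addr] = pf_id
        # During warm-up, always issue; afterwards, gate on observed accuracy.
        return total < self.warmup or useful / total >= self.threshold

    def demand_access(self, addr):
        """On a demand access, credit the prefetcher that predicted this address."""
        pf_id = self.buf.pop(addr, None)
        if pf_id is not None:
            self.stats[pf_id][0] += 1
```

Under this sketch, an accurate stride prefetcher keeps issuing, while a prefetcher whose candidates are never demanded is silenced after the warm-up period, which is the filtering behavior the abstract describes.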