Fixed and Adaptive Sequential Prefetching in Shared Memory Multiprocessors

Authors:
Fredrik Dahlgren;Michel Dubois;Per Stenstrom
Affiliations:
Lund University;University of Southern California;Lund University
Venue:
ICPP '93 Proceedings of the 1993 International Conference on Parallel Processing - Volume 01
Year:
1993

Abstract

To offset the effect of read miss penalties on processor utilization in shared-memory multiprocessors, several software- and hardware-based data prefetching schemes have been proposed. A major advantage of hardware techniques is that they need no support from the programmer or compiler. Sequential prefetching is a simple hardware-controlled prefetching technique that relies on the automatic prefetch of consecutive blocks following the block that misses in the cache. In its simplest form, the number of prefetched blocks on each miss is fixed throughout the execution. However, since the prefetching efficiency varies during the execution of a program, we propose to adapt the number of prefetched blocks according to a dynamic measure of prefetching effectiveness. Simulations of this adaptive scheme show significant reductions of the read penalty and of the overall execution time.
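
The difference between the fixed and adaptive variants can be illustrated with a toy, single-cache simulation in Python. This is only a sketch under stated assumptions, not the hardware mechanism evaluated in the paper: the abstract does not say how prefetching effectiveness is measured, so the class `AdaptivePrefetchCache`, the LRU replacement policy, the measurement window, and the `high`/`low` thresholds are all hypothetical choices made for illustration. Effectiveness is approximated here as the fraction of prefetched blocks that are later referenced.

```python
from collections import OrderedDict


class AdaptivePrefetchCache:
    """Toy LRU cache of block numbers with sequential prefetching on a miss.

    All parameters are illustrative assumptions, not values from the paper.
    """

    def __init__(self, capacity=64, degree=1, max_degree=8,
                 window=16, high=0.75, low=0.40):
        # Maps block number -> True while the block is prefetched but unused.
        self.blocks = OrderedDict()
        self.capacity = capacity
        self.degree = degree          # blocks prefetched on each miss
        self.max_degree = max_degree  # set equal to degree for a fixed scheme
        self.window = window          # prefetches issued between adaptations
        self.high = high              # hypothetical "increase" threshold
        self.low = low                # hypothetical "decrease" threshold
        self.issued = 0               # prefetches issued in current window
        self.useful = 0               # prefetched blocks referenced in window
        self.misses = 0

    def _install(self, block, prefetched):
        """Insert a block, evicting the least recently used one if full."""
        if block in self.blocks:
            return False
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)
        self.blocks[block] = prefetched
        return True

    def _adapt(self):
        """Adjust the degree from the measured fraction of useful prefetches."""
        ratio = self.useful / self.issued
        if ratio > self.high:
            self.degree = min(self.degree + 1, self.max_degree)
        elif ratio < self.low:
            # Simplification: never drop below 1, so measurement continues.
            self.degree = max(self.degree - 1, 1)
        self.issued = 0
        self.useful = 0

    def read(self, block):
        """Reference a block; return True on a hit, False on a miss."""
        if block in self.blocks:
            if self.blocks[block]:        # first use of a prefetched block
                self.blocks[block] = False
                self.useful += 1
            self.blocks.move_to_end(block)
            return True
        self.misses += 1
        self._install(block, prefetched=False)
        for k in range(1, self.degree + 1):   # consecutive blocks after miss
            if self._install(block + k, prefetched=True):
                self.issued += 1
        if self.issued >= self.window:
            self._adapt()
        return False


def count_misses(trace, **params):
    """Run a block-address trace through a fresh cache and return it."""
    cache = AdaptivePrefetchCache(**params)
    for block in trace:
        cache.read(block)
    return cache
```

Calling `count_misses(range(1000), degree=1, max_degree=1)` models the fixed scheme, while the default parameters let the degree grow on a sequential scan and shrink on a trace with no spatial locality. One deliberate simplification: the degree never falls to zero, because this sketch would then issue no prefetches and could no longer measure their usefulness.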