Prefetching transfers a data item from its storage location to its usage location ahead of time, so that communication is hidden and does not delay computation. We present a novel prefetching technique for object-based Distributed Shared Memory (DSM) systems and discuss its implementation. In contrast to page-based DSMs, an object-based DSM distributes data at the level of objects; the resulting data streams are more complex, which renders current prefetchers for page-based DSMs unsuitable. To predict future data accesses, our prefetcher uses a new predictor (Esodyp+) based on a modified Markov model that automatically adapts to program behavior. We compare our prefetching strategy with both a stride prefetcher and the prefetcher of the Delphi DSM system. For several benchmarks, our prefetching strategy reduces the number of network messages by about 60%. On 8 nodes, runtime is reduced by 15% on average. Hence, network-bound programs benefit from our solution. In contrast to the other predictors, Esodyp+ achieves a prediction accuracy above 80% for the benchmarks, with only 8% of prefetches going unused.
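To make the idea of Markov-based access prediction concrete, the following is a minimal sketch of a plain first-order Markov predictor over a stream of object IDs. It is not Esodyp+ (the abstract does not describe the modifications to the model), and the names `MarkovPredictor` and `simulate` as well as the sample access stream are assumptions chosen purely for illustration.

```python
from collections import Counter, defaultdict


class MarkovPredictor:
    """Plain first-order Markov predictor over object IDs (illustrative, not Esodyp+)."""

    def __init__(self):
        # transitions[a][b] counts how often an access to b directly followed a.
        self.transitions = defaultdict(Counter)
        self.last = None

    def observe(self, obj_id):
        """Record one access and update the transition counts."""
        if self.last is not None:
            self.transitions[self.last][obj_id] += 1
        self.last = obj_id

    def predict(self, obj_id):
        """Return the most frequent successor of obj_id, or None if none is known."""
        followers = self.transitions.get(obj_id)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


def simulate(accesses):
    """Replay an access stream; return (evaluated prefetches, correct prefetches)."""
    predictor = MarkovPredictor()
    prefetched = None  # object the hypothetical prefetcher fetched for the next access
    evaluated = hits = 0
    for obj_id in accesses:
        if prefetched is not None:
            evaluated += 1
            if prefetched == obj_id:
                hits += 1
        predictor.observe(obj_id)
        prefetched = predictor.predict(obj_id)
    return evaluated, hits


# A repeating pattern without a constant stride: 1 -> 5 -> 2 -> 1 -> ...
print(simulate([1, 5, 2] * 3))  # prints (5, 5)
```

In this toy stream the differences between consecutive IDs alternate (+4, -3, -1), so a predictor that assumes a fixed stride has no stable stride to lock onto, whereas the transition table picks up the pattern after one full repetition. The ratio of correct to evaluated prefetches corresponds loosely to the prediction accuracy reported above; evaluated prefetches that miss are the unused ones.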