Hardware prediction for data coherency of scientific codes on DSM

Authors:
J. T. Acquaviva;W. Jalby
Affiliations:
CEA/DAM, French Atomic Energy Commmission;PRiSM Lab., Versailles University
Venue:
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Year:
2000

Citing 14
Cited 0

Munin: distributed shared memory based on type-specific memory coherence

PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
Alternative implementations of two-level adaptive branch prediction

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
An adaptive cache coherence protocol optimized for migratory sharing

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Evaluating stream buffers as a secondary cache replacement

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The SGI Origin: a ccNUMA highly scalable server

Proceedings of the 24th annual international symposium on Computer architecture
Using prediction to accelerate coherence protocols

Proceedings of the 25th annual international symposium on Computer architecture
Memory sharing predictor: the key to a speculative coherent DSM

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Parallel Computer Architecture: A Hardware/Software Approach

Parallel Computer Architecture: A Hardware/Software Approach
Effective Hardware-Based Data Prefetching for High-Performance Processors

IEEE Transactions on Computers
The Augmint multiprocessor simulation toolkit for Intel x86 architectures

ICCD '96 Proceedings of the 1996 International Conference on Computer Design, VLSI in Computers and Processors
An argument for simple COMA

HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
PRISM: An Integrated Architecture for Scalable Shared Memory

HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Improving CC-NUMA Performance Using Instruction-Based Prediction

HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Identification and optimization of sharing patterns for scalable shared-memory multiprocessors

Identification and optimization of sharing patterns for scalable shared-memory multiprocessors

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper proposes a hardware mechanism for reducing coherency overhead occurring in scientific computations within DSM systems. A first phase aims at detecting, in the address space regular patterns (called streams) of coherency events (such as requests for exclusive, shared or invalidation).Once a stream is detected at a loop level, regularity of data access can be exploited at the loop level (spatial locality) but also between loops (temporal locality). We present a hardware mechanism capable of detecting and exploiting efficiently these regular patterns. Expectable benefits as well as hardware complexity are discussed and the limited drawbacks and potential overheads are exposed. For a benchmarks suite of typical scientific applications results are very promising, both in terms of coherency streams and the effectiveness of our optimizations.