In shared-memory multiprocessors with Non-Uniform Memory Access (NUMA) characteristics, effective caching and memory locality are essential to performance. In this paper, we argue for a new approach to cache and NUMA memory management based on integrating application-sharing characteristics with system runtime management of shared data. An application's shared data is subdivided into shared regions of memory, and the application explicitly defines the operations on those regions. System runtime management can then achieve high cache hit rates and memory locality based on these region specifications.

Region-oriented cache management has an advantage over other software cache coherence techniques in that shared regions are always cacheable. Region-oriented main memory management improves upon traditional NUMA memory management by maintaining coherence at region granularity. This eliminates page-level false sharing as a concern for coherence and allows for more effective replication strategies. The specification of regions also permits the use of relaxed memory coherence protocols, which further improve performance.
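To make the idea of region-granularity coherence concrete, the following is a minimal toy sketch in Python, not the system described in the paper. The names `Region`, `RegionRuntime`, `begin_read`, `begin_write`, and `demo` are hypothetical and chosen only for illustration. The model assumes a simple write-invalidate policy in which each region tracks the set of nodes holding a valid copy.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical shared region: a named block of application data."""
    name: str
    data: list
    valid: set = field(default_factory=lambda: {0})  # nodes with a valid copy


class RegionRuntime:
    """Toy runtime that keeps copies coherent per region (an assumption,
    not the paper's protocol): writes invalidate other copies of the
    same region only."""

    def __init__(self):
        self.remote_fetches = 0
        self.invalidations = 0

    def _ensure_copy(self, region, node):
        # Fetch a copy only if this node does not already hold a valid one.
        if node not in region.valid:
            self.remote_fetches += 1
            region.valid.add(node)

    def begin_read(self, region, node):
        self._ensure_copy(region, node)
        return region.data

    def begin_write(self, region, node):
        self._ensure_copy(region, node)
        # Invalidate every other copy of this region, and nothing else.
        self.invalidations += len(region.valid - {node})
        region.valid = {node}
        return region.data


def demo():
    """Node 0 repeatedly writes region a while node 1 repeatedly reads
    region b. The two regions could sit on the same page, yet node 1's
    copy of b is never invalidated by writes to a."""
    rt = RegionRuntime()
    a = Region("a", [0] * 4)
    b = Region("b", [0] * 4)
    for i in range(3):
        rt.begin_write(a, node=0)[0] = i
        rt.begin_read(b, node=1)
    return rt.remote_fetches, rt.invalidations
```

In this sketch, node 1 fetches region `b` once and then reads its local copy on every later iteration. Under page-granularity coherence, if `a` and `b` shared a page, each write to `a` could invalidate node 1's copy of the whole page and force a refetch.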