This paper presents an approach that dynamically and transparently improves the data locality of memory references on systems with Non-Uniform Memory Access (NUMA) characteristics. The approach is based on run-time data redistribution via user-level page migration. It uses memory access histograms gathered by hardware monitors to decide where shared data should be placed. Initial performance experiments on several applications show the potential for a significant gain in speedup. In addition, a graphical user interface has been developed that shows the actual data movement, helping the user to understand the behavior of the application and to detect performance bottlenecks. This feature complements an existing Data Layout Visualization tool for observing memory locality.
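To make the histogram-driven placement idea concrete, the following is a minimal sketch of how per-page access counts might be turned into migration decisions. The abstract does not describe the actual decision rule, so everything here is an assumption for illustration: the function name `plan_migrations`, the `min_accesses` and `min_ratio` thresholds, and the layout of the histogram as one access count per node for each page. The migration itself, which would go through an operating-system or driver interface, is not shown.

```python
from typing import Dict, List


def plan_migrations(
    histograms: Dict[int, List[int]],
    home: Dict[int, int],
    min_accesses: int = 100,   # hypothetical: ignore rarely touched pages
    min_ratio: float = 2.0,    # hypothetical: required advantage of the new node
) -> Dict[int, int]:
    """Return {page: target_node} for pages that look worth migrating.

    histograms[page][node] is the number of accesses to `page` issued by
    `node`, as a hardware monitor might report it. home[page] is the node
    that currently holds the page.
    """
    moves: Dict[int, int] = {}
    for page, counts in histograms.items():
        # Skip pages with too few samples to justify a migration.
        if sum(counts) < min_accesses:
            continue
        # Node that accesses this page most often.
        best = max(range(len(counts)), key=counts.__getitem__)
        current = home[page]
        # Migrate only if the dominant node clearly outweighs the current
        # home node, to avoid moving pages back and forth.
        if best != current and counts[best] >= min_ratio * max(counts[current], 1):
            moves[page] = best
    return moves


if __name__ == "__main__":
    histograms = {0: [500, 10], 1: [5, 3], 2: [120, 100], 3: [10, 400]}
    home = {0: 1, 1: 1, 2: 1, 3: 1}
    print(plan_migrations(histograms, home))
```

In this sketch, only page 0 is selected: page 1 has too few accesses, page 2 is shared almost evenly between both nodes, and page 3 already resides on the node that uses it most.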