Performance counters, also known as hardware counters, are a powerful monitoring mechanism included in the Performance Monitoring Unit (PMU) of most modern microprocessors. They are gaining popularity as an analysis and validation tool for profiling, since their performance impact is virtually imperceptible and their precision has noticeably increased thanks to the new Precise Event-Based Sampling (PEBS) features.

In this paper, we present and evaluate a novel user-level tool, based on hardware counters, for monitoring and migrating pages dynamically. The tool supports different migration strategies and can attach to and monitor a target application without modifying it. Page migration is performed in a timely manner, and its overhead is outweighed by the benefit of the improved data locality.

As a case study, an access-based migration algorithm was implemented and integrated into our tool. Performance results on a NUMA system show a noticeable reduction in remote accesses and execution time, achieving speedups of up to ~21% in a multiprogrammed environment.
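
To make the idea of an access-based migration strategy concrete, the following is a minimal sketch and not the algorithm from the paper, whose details are not given here. It assumes that sampled memory accesses arrive as (virtual address, accessing NUMA node) pairs, for example from PEBS-style samples, and that the current home node of each page is known. The names `plan_migrations`, `page_home`, `min_samples`, and the 4 KiB page size are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical)


def plan_migrations(samples, page_home, min_samples=4):
    """Return {page_number: target_node} for pages worth migrating.

    samples:     iterable of (virtual_address, accessing_node) pairs
    page_home:   dict mapping page_number -> node where the page lives now
    min_samples: ignore pages with fewer samples than this (assumed threshold)
    """
    # Build a per-page histogram of accesses by node.
    per_page = defaultdict(Counter)
    for addr, node in samples:
        per_page[addr // PAGE_SIZE][node] += 1

    plan = {}
    for page, counts in per_page.items():
        if sum(counts.values()) < min_samples:
            continue  # too few samples to justify the cost of a migration
        home = page_home.get(page)
        if home is None:
            continue  # placement unknown, leave the page alone
        # Pick the node with the most accesses; ties go to the lower node id.
        best_node, best_count = max(
            counts.items(), key=lambda kv: (kv[1], -kv[0])
        )
        # Migrate only if a remote node strictly dominates the home node.
        if best_node != home and best_count > counts.get(home, 0):
            plan[page] = best_node
    return plan
```

The sketch only produces a migration plan. On Linux, a user-level tool could apply such a plan with the `move_pages(2)` system call, although the text above does not say which mechanism the authors use.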