We address the problem of paged main memory management in the local/remote architecture subclass of shared memory multiprocessors. We consider the case where the operating system has primary responsibility and uses page migration as its main tool. We identify some of the key issues with respect to architectural support (reference history maintenance and page size) and operating system mechanism (duration between daemon passes and number of migration daemons).

The experiments were conducted using software-implemented page tables on a 32-node BBN Butterfly Plus™. Several numerical programs with both synthetic and real data were used as the workload. The primary conclusion is that, for the cases considered, migration was at best marginally effective. On the other hand, practical migration mechanisms were robust and never significantly degraded performance. The specific results include: 1) referenced bits with aging can closely approximate Usage fields, 2) larger page sizes are beneficial except when the page is large enough to include locality sets of two processes, and 3) multiple migration daemons can be useful.

Only small regions of the space of architectural, system, and workload parameters were explored. Further investigation of other parameter combinations is clearly warranted.
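To make the idea of referenced bits with aging more concrete, the following is a minimal sketch of a migration daemon pass, not the mechanism used in the experiments. It assumes one referenced bit and one aging counter per node for each page. The counter width (`AGE_BITS`), the `margin` parameter, and the names `Page`, `touch`, and `daemon_pass` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

AGE_BITS = 8  # assumed width of each aging counter (hypothetical)


@dataclass
class Page:
    home: int        # node whose local memory currently holds the page
    num_nodes: int
    ref_bits: list = field(init=False)  # one referenced bit per node
    age: list = field(init=False)       # one aging counter per node

    def __post_init__(self):
        self.ref_bits = [False] * self.num_nodes
        self.age = [0] * self.num_nodes

    def touch(self, node):
        """Record that `node` referenced this page since the last pass."""
        self.ref_bits[node] = True


def daemon_pass(pages, margin=0):
    """Age every counter, then migrate pages toward their heaviest user.

    Returns a list of (page_index, old_home, new_home) for pages moved.
    """
    moves = []
    for i, p in enumerate(pages):
        for n in range(p.num_nodes):
            # Aging: shift the history right and insert the new bit on top,
            # so recent references weigh more than old ones.
            p.age[n] = (p.age[n] >> 1) | (int(p.ref_bits[n]) << (AGE_BITS - 1))
            p.ref_bits[n] = False  # clear the bit for the next interval
        best = max(range(p.num_nodes), key=lambda n: p.age[n])
        # Assumed policy: move only if another node clearly dominates.
        if best != p.home and p.age[best] > p.age[p.home] + margin:
            moves.append((i, p.home, best))
            p.home = best
    return moves
```

In this sketch the duration between daemon passes corresponds to how often `daemon_pass` is called, and running several daemons would amount to partitioning `pages` among them. A larger `margin` makes migration more conservative.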