User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors

Authors:
Dimitrios S. Nikolopoulos;Theodore S. Papatheodorou;Constantine D. Polychronopoulos;Jesús Labarta;Eduard Ayguadé
Affiliations:
-;-;-;-;-
Venue:
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Year:
2000

Citing 11
Cited 18

Process control and scheduling issues for multiprogrammed shared-memory multiprocessors

SOSP '89 Proceedings of the twelfth ACM symposium on Operating systems principles
Scheduling and page migration for multiprocessor compute servers

ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Operating system support for improving data locality on CC-NUMA compute servers

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
The SGI Origin: a ccNUMA highly scalable server

Proceedings of the 24th annual international symposium on Computer architecture
Flexible use of memory for replication/migration in cache-coherent DSM multiprocessors

Proceedings of the 25th annual international symposium on Computer architecture
Scaling application performance on a cache-coherent multiprocessor

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Performance experiences on Sun's Wildfire prototype

SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
A case for user-level dynamic page migration

Proceedings of the 14th international conference on Supercomputing
Using simple page placement policies to reduce the cost of cache fills in coherent shared-memory systems

IPPS '95 Proceedings of the 9th International Symposium on Parallel Processing
UPMLIB: A Runtime System for Tuning the Memory Performance of OpenMP Programs on Scalable Shared-Memory Multiprocessors

LCR '00 Selected Papers from the 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
WildFire: A Scalable Path for SMPs

HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture

Is data distribution necessary in OpenMP?

Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Improving Data Locality Using Dynamic Page Migration Based on Memory Access Histograms

ICCS '02 Proceedings of the International Conference on Computational Science-Part II
Implementing OpenMP Using Dataflow Execution Model for Data Locality and Efficient Parallel Execution

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
UPMLIB: A Runtime System for Tuning the Memory Performance of OpenMP Programs on Scalable Shared-Memory Multiprocessors

LCR '00 Selected Papers from the 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Evaluation of the memory page migration influence in the system performance: the case of the SGI O2000

ICS '03 Proceedings of the 17th annual international conference on Supercomputing
ARS: an adaptive runtime system for locality optimization

Future Generation Computer Systems - Tools for program development and analysis
Page migration with dynamic space-sharing scheduling policies: the case of the SGI 02000

International Journal of Parallel Programming - Special issue II: The 17th annual international conference on supercomputing (ICS'03)
Hardware profile-guided automatic page placement for ccNUMA systems

Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
A transparent runtime data distribution engine for OpenMP

Scientific Programming
Memory access behavior analysis of NUMA-based shared memory programs

Scientific Programming
Exploration of distributed shared memory architectures for NoC-based multiprocessors

Journal of Systems Architecture: the EUROMICRO Journal
Hardware monitors for dynamic page migration

Journal of Parallel and Distributed Computing
Dynamic Task and Data Placement over NUMA Architectures: An OpenMP Runtime Perspective

IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
Feedback-directed page placement for ccNUMA via hardware-generated memory traces

Journal of Parallel and Distributed Computing
Dual-layered file cache on cc-NUMA system

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
FELI: HW/SW support for on-chip distributed shared memory in multicores

Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Node-based memory management for scalable NUMA architectures

Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers
A flexible and dynamic page migration infrastructure based on hardware counters

The Journal of Supercomputing

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper presents algorithms for improving the performance of parallel programs on multiprogrammed shared-memory NUMA multiprocessors, via the use of user-level dynamic page migration. The idea that drives the algorithms is that a page migration engine can perform accurate and timely page migrations in a multiprogrammed system if it can correlate page reference information with scheduling information obtained from the operating system. The necessary page migrations can be performed as a response to scheduling events that break the implicit association between threads and their memory affinity sets. We present two algorithms that use feedback from the kernel scheduler to aggressively migrate pages upon thread migrations. The first algorithm exploits the iterative nature of parallel programs, while the second targets generic codes without making assumptions on their structure. Performance evaluation on an SGI Origin2000 shows that our page migration algorithms provide substantial improvements in throughput of up to 264% compared to the native IRIX 6.5.5 page placement and migration schemes.