UPMLIB: A Runtime System for Tuning the Memory Performance of OpenMP Programs on Scalable Shared-Memory Multiprocessors

Authors:
Dimitrios S. Nikolopoulos;Theodore S. Papatheodorou;Constantine D. Polychronopoulos;Jesús Labarta;Eduard Ayguadé
Affiliations:
-;-;-;-;-
Venue:
LCR '00 Selected Papers from the 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Year:
2000

Citing 7
Cited 6

Operating system support for improving data locality on CC-NUMA compute servers

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
The SGI Origin: a ccNUMA highly scalable server

Proceedings of the 24th annual international symposium on Computer architecture
Scaling application performance on a cache-coherent multiprocessor

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
A case for user-level dynamic page migration

Proceedings of the 14th international conference on Supercomputing
Parallel Computer Architecture: A Hardware/Software Approach

Parallel Computer Architecture: A Hardware/Software Approach
Leveraging Transparent Data Distribution in OpenMP via User-Level Dynamic Page Migration

ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors

ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing

Quantifying and Resolving Remote Memory Access Contention on Hardware DSM Multiprocessors

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors

ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Feedback-directed page placement for ccNUMA via hardware-generated memory traces

Journal of Parallel and Distributed Computing
Dual-layered file cache on cc-NUMA system

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
A dynamic optimization framework for OpenMP

IWOMP'11 Proceedings of the 7th international conference on OpenMP in the Petascale era
Node-based memory management for scalable NUMA architectures

Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present the design and implementation of UPMLIB, a runtime system that provides transparent facilities for dynamically tuning the memory performance of OpenMP programs on scalable shared-memory multiprocessors with hardware cache-coherence. UPMLIB integrates information from the compiler and the operating system, to implement algorithms that perform accurate and timely page migrations. The algorithms and the associated mechanisms correlate memory reference information with the semantics of parallel programs and scheduling events that break the association between threads and data for which threads have memory affinity at runtime. Our experimental evidence shows that UPMLIB makes OpenMP programs immune to the page placement strategy of the operating system, thus obviating the need for introducing data placement directives in OpenMP. Furthermore, UPMlib provides solid improvements of throughput in multiprogrammed execution environments.