Costly page migration is a major obstacle to integrating OpenMP with page-based software distributed shared memory (SDSM) to realize an easy-to-use programming paradigm for SMP clusters. To reduce the impact of page migration overhead on application execution time, previous research has mainly focused on reducing the number of page migrations and on hiding the migration overhead by overlapping computation with communication. We propose the 'collective-prefetch' technique, which overlaps the page migrations themselves and remains effective even when the prior approaches cannot be applied. Experiments with a communication-intensive application show that our technique significantly reduces page migration overhead, cutting overall execution time to 57%–79% of the original.
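The core idea — issuing the page migrations needed by an upcoming computation phase as a batch, so their network latencies overlap instead of accumulating one fault at a time — can be sketched as follows. This is an illustrative analogy only, not the paper's SDSM implementation: `fetch_page`, the simulated latency, and the thread-pool batching are all hypothetical stand-ins.

```python
import concurrent.futures
import time

def fetch_page(page_id):
    """Hypothetical stand-in for one remote page migration (a network round trip)."""
    time.sleep(0.05)  # simulate a fixed per-page network latency
    return page_id

def serial_faulting(pages):
    # Baseline: each access faults and waits for its page in turn,
    # so per-page latencies add up.
    return [fetch_page(p) for p in pages]

def collective_prefetch(pages):
    # Collective prefetch: issue all migrations for the phase at once,
    # so their latencies overlap rather than accumulate.
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(pages)) as pool:
        return list(pool.map(fetch_page, pages))

pages = list(range(8))

t0 = time.perf_counter()
serial_faulting(pages)
t_serial = time.perf_counter() - t0

t0 = time.perf_counter()
collective_prefetch(pages)
t_batch = time.perf_counter() - t0

print(t_batch < t_serial)  # the overlapped batch finishes markedly sooner
```

With 8 pages at 50 ms each, the serial path takes roughly 0.4 s while the overlapped batch takes close to a single round trip — the same latency-overlap effect the collective-prefetch technique targets, though a real SDSM overlaps actual protocol messages rather than threads.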