In this paper, we examine some of the challenges in supporting OpenMP applications on a cluster system based on Software Distributed Shared Memory (DSM). We present detailed measurements of the performance characteristics of realistic OpenMP applications from the SPEC OMP2001 benchmarks. Based on these measurements, we discuss application and system characteristics that impede the efficient execution of these programs on a Software DSM system. We point out pitfalls of a naive translation from OpenMP into the API provided by a Software DSM system, and we discuss a set of possible program optimization techniques.