Lazy release consistency for software distributed shared memory
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
MPI versus MPI+OpenMP on IBM SP for the NAS benchmarks
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Extending OpenMP for NUMA machines
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
SPMD OpenMP versus MPI on a IBM SMP for 3 Kernels of the NAS Benchmarks
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Hi-index | 0.00 |
OpenMP is emerging as a viable high-level programming model for shared memory parallel systems. Although it has also been implemented on ccNUMA architectures, it is hard to obtain high performance on such systems, particularly when large numbers of threads are involved. Moreover, it is applicable to NUMA machines only if a software DSM system is present. In this paper, we discuss various ways in which OpenMP may be used on ccNUMA and NUMA architectures, and evaluate several programming styles on the SGI Origin 2000, and on TreadMarks, a Software Distributed Shared Memory System from Rice University. These results have encouraged us to begin work on a compiler that accepts an extended OpenMP and translates such code to an equivalent version that provides superior performance on both of these platforms.