The availability of multiprocessors and high-performance networks offers the opportunity to build CLUMPs (Clusters of Multiprocessors) and use them as parallel computing platforms. The main distinctive feature of the CLUMP architecture compared with conventional parallel computers is its hybrid memory model: message passing between nodes and shared memory inside each node. To be widely used, CLUMPs must be able to execute existing programs with few modifications. We investigate the performance of a programming approach based on the MPI standard for inter-multiprocessor communication and the OpenMP standard for intra-multiprocessor exchanges. The approach consists of parallelizing MPI programs within each node using an OpenMP directive-based parallelizing compiler. The paper details the approach in the context of biprocessor PC CLUMPs and presents a performance evaluation on the NAS Parallel Benchmarks.
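The two-level decomposition described in the abstract can be illustrated with a minimal sketch in C. It is not the paper's code. The names `block_t`, `block_range`, `local_dot` and `hybrid_dot` are hypothetical, and the dot-product kernel is an assumed stand-in for a benchmark loop. To keep the sketch runnable without an MPI installation, the inter-node level is simulated by a serial loop over ranks; in a real hybrid program `rank` and `nprocs` would come from `MPI_Comm_rank` and `MPI_Comm_size`, and the partial sums would be combined with `MPI_Allreduce`. The intra-node level uses a standard OpenMP `parallel for` directive with a `reduction` clause.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: half-open range [begin, end) of iterations owned by one rank. */
typedef struct { size_t begin, end; } block_t;

/* Inter-node level: block distribution of n iterations over nprocs ranks.
   The first (n % nprocs) ranks receive one extra iteration. */
static block_t block_range(size_t n, int rank, int nprocs) {
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    size_t base = n / (size_t)nprocs;
    size_t rem  = n % (size_t)nprocs;
    size_t r    = (size_t)rank;
    block_t b;
    b.begin = r * base + (r < rem ? r : rem);
    b.end   = b.begin + base + (r < rem ? 1 : 0);
    return b;
}

/* Intra-node level: the OpenMP directive splits the rank's local block among
   the threads of one multiprocessor node. Without OpenMP support the pragma
   is ignored and the loop runs serially with the same result. */
static double local_dot(const double *x, const double *y, block_t b) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = (long)b.begin; i < (long)b.end; ++i)
        sum += x[i] * y[i];
    return sum;
}

/* Assumption: this serial loop over ranks stands in for nprocs MPI processes
   followed by an MPI_Allreduce of their partial sums. */
static double hybrid_dot(const double *x, const double *y, size_t n, int nprocs) {
    double total = 0.0;
    for (int rank = 0; rank < nprocs; ++rank)
        total += local_dot(x, y, block_range(n, rank, nprocs));
    return total;
}
```

Calling `hybrid_dot(x, y, n, 2)` mimics two nodes, each of which would run the inner loop on its own processors. The point of the design is that the existing message-passing structure is left untouched and only the per-rank loops receive directives, which matches the "few modifications" goal stated above.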