The hybrid memory model of clusters of multiprocessors raises two issues: the programming model and performance. Many parallel programs have been written using the MPI standard. To evaluate the relevance of hybrid models for existing MPI codes, we compare a unified model (MPI) with a hybrid one (fine-grain OpenMP parallelization after profiling) on the NAS 2.3 benchmarks on two IBM SP systems. Which model is superior depends on 1) the level of shared-memory parallelization, 2) the communication patterns, and 3) the memory access patterns. The relative speeds of the main architecture components (CPU, memory, and network) are of great importance in selecting a model. With the hybrid model used here, our results show that a unified MPI approach is better for most of the benchmarks. The hybrid approach becomes better only when fast processors make the communication performance significant and the level of parallelization is sufficient.