Performance of hybrid message-passing and shared-memory parallelism for discrete element modeling
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
A comparison of three programming models for adaptive applications on the Origin2000
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Performance modeling and tuning of an unstructured mesh CFD application
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Dual-Level Parallel Analysis of Harbor Wave Response Using MPI and OpenMP
International Journal of High Performance Computing Applications
Communication Bandwidth of Parallel Programming Models on Hybrid Architectures
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Communication and Optimization Aspects on Hybrid Architectures
Proceedings of the 9th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Approaches Based on Permutations for Partitioning Sparse Matrices on Multiprocessors
The Journal of Supercomputing
Hierarchical hybrid grids: achieving TERAFLOP performance on large scale finite element simulations
International Journal of Parallel, Emergent and Distributed Systems
Parallel symmetric sparse matrix-vector product on scalar multi-core CPUs
Parallel Computing
Hi-index | 0.01 |
Eigenvalue problems involving very large sparse matrices are common to various fields in science. In general, the numerical core of iterative eigenvalue algorithms is a matrix-vector multiplication (MVM) involving the large sparse matrix. We present three different programming approaches for parallel MVM on present day supercomputers. In addition to a pure message-passing approach, two hybrid parallel implementations are introduced based on simultaneous use of message-passing and shared-memory programming models. For a modern SMP cluster (HITACHI SR8000) performance and scalability of the hybrid implementations are discussed and compared with the pure message-passing approach on massively-parallel systems (CRAY T3E), vector computers (NEC SX5e) and distributed shared-memory systems (SGI Origin3800).