We evaluate dynamic data remapping on clusters of SMP architectures under the OpenMP, MPI, and hybrid paradigms. The traditional method of multi-dimensional array transposition needs an auxiliary array of the same size and a copy-back stage. We recently developed an in-place method using vacancy tracking cycles. Extensive comparisons show that the vacancy tracking algorithm outperforms the traditional two-array method. The performance of multi-threaded parallelism using OpenMP is first tested with different scheduling methods and different numbers of threads. Both methods are then parallelized using several parallel paradigms. At the node level, pure OpenMP outperforms pure MPI by a factor of 2.76 for the vacancy tracking method. Across the entire cluster of SMP nodes, with carefully chosen thread counts, the hybrid MPI/OpenMP implementation outperforms pure MPI by a factor of 3.79 for the traditional method and 4.44 for the vacancy tracking method, demonstrating the validity of the parallel paradigm of mixing MPI with OpenMP.
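To make the idea of vacancy tracking cycles concrete, the following is a minimal serial sketch for the two-dimensional case only. It is not the implementation evaluated in the paper: the function name `transpose_in_place`, the row-major flat-list layout, and the leader test (a cycle is processed only from its smallest index, which avoids any auxiliary bookkeeping array) are all assumptions made for illustration. The sketch relies on the standard fact that, for a `rows x cols` row-major array of `n = rows * cols` elements, the element that belongs at position `j` after transposition currently sits at position `(j * cols) mod (n - 1)`.

```python
def transpose_in_place(a, rows, cols):
    """Transpose a rows x cols row-major flat list in place.

    Illustrative sketch only: each permutation cycle is followed by
    moving one element into the current vacancy, which opens a new
    vacancy at the element's old position.
    """
    n = rows * cols
    if len(a) != n:
        raise ValueError("len(a) must equal rows * cols")
    if n < 2:
        return
    last = n - 1  # positions 0 and n-1 never move

    for start in range(1, last):
        # Assumed leader test: walk the cycle and stop as soon as we
        # reach an index that is not larger than start.
        src = (start * cols) % last
        while src > start:
            src = (src * cols) % last
        if src != start:
            continue  # this cycle was already handled from a smaller index

        # Open a vacancy at start, then keep filling the vacancy with
        # the element that belongs there.
        saved = a[start]
        vacancy = start
        while True:
            src = (vacancy * cols) % last
            if src == start:
                a[vacancy] = saved  # close the cycle
                break
            a[vacancy] = a[src]
            vacancy = src


# Usage: a 2 x 3 array becomes its 3 x 2 transpose without a second array.
data = [1, 2, 3,
        4, 5, 6]
transpose_in_place(data, 2, 3)
print(data)  # [1, 4, 2, 5, 3, 6]
```

Only one temporary element is held at any time, so the auxiliary array and the copy-back stage of the two-array method are avoided. Because distinct cycles touch disjoint sets of positions, they could in principle be assigned to different threads or processes, although how the paper distributes that work is not shown here.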