As parallel architectures grow more diverse and the time and labor needed to develop parallel applications rise, the performance portability of parallel programs is becoming increasingly important and should be considered when designing parallel execution models, APIs, and runtime system software. This paper analyzes both the code portability and the performance portability of parallel programs based on the EARTH model, an event-driven, fine-grain multithreaded execution and architecture model. We discuss several design considerations of the EARTH system that contribute to the performance portability of parallel applications. Experiments with four representative benchmarks are conducted on several different parallel architectures, including two clusters listed in the 23rd TOP500 supercomputer list. The results demonstrate that EARTH-based programs can achieve robust performance portability across the selected hardware platforms without any code modification or tuning.