As parallel architectures become more diverse and parallel applications take longer to develop, performance portability has become one of the major considerations in designing the next generation of parallel program execution models, APIs, and runtime system software. This paper analyzes both the code portability and the performance portability of parallel programs for fine-grained multithreaded execution and architecture models. We concentrate on one particular event-driven, fine-grained multithreaded execution model, EARTH, and discuss several design considerations of the EARTH model and runtime system that contribute to the performance portability of parallel applications. We believe that these are important issues for the design of future high-end computing system software. Four representative benchmarks were run on several different parallel architectures, including two clusters listed in the 23rd TOP500 supercomputer list. The results demonstrate that EARTH-based programs can achieve robust performance portability across the selected hardware platforms without any code modification or tuning.