Accurate performance evaluation, modelling and prediction of a message passing simulation code based on middleware

Authors:
Michela Taufer;Thomas Stricker
Affiliations:
Swiss Institute of Technology (ETH), CH-8092 Zuerich, Switzerland;Swiss Institute of Technology (ETH), CH-8092 Zuerich, Switzerland
Venue:
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Year:
1998

Abstract

In distributed and vectorized computing, an application can run on a large number of very different supercomputing platforms. Therefore most traditional parallel codes are ill equipped to collect data about their resource usage or their run-time behavior, the corresponding data are rarely published, and few scientists plan an application and its platform systematically. As an improvement over the current state of the art, we propose an integrated approach to performance evaluation, modeling and prediction for different platforms. Our approach combines analytical modeling with systematically designed experiments on full application runs, reduced application kernels and some benchmarks. We studied our methodology of performance assessment with Opal, an example code in molecular biology, developed at our institution to run on our four Cray J90 "Classic" vector SMPs. Besides a detailed assessment of the performance achieved on the J90s, the primary goal of our study was to find the most suitable and most cost-effective hardware platform for the application, in particular to check its suitability for slow CoPs, SMP CoPs and fast CoPs, three flavors of Clusters of PCs (CoPs) built with off-the-shelf Intel Pentium processors. A performance assessment based on our model is much easier than porting and parallelizing the application for a new target machine, so we could easily obtain and include performance estimates for a T3E-900, a high-end MPP system. The predicted execution times and speedup figures indicate that a well-designed cluster of PCs achieves performance similar to, if not better than, that of the J90 vector processors currently used, and that its computational efficiency compares favorably to the T3E-900 for this particular application code.
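
The abstract does not state the analytical model itself. As a purely illustrative sketch of what predicting execution time and speedup for a message passing code can look like, the Python below assumes a generic two-term model: perfectly divided compute time plus a latency/bandwidth communication cost per processor. The `Platform` fields, the function names and every parameter are hypothetical and are not taken from the paper; real platform values would have to come from measurements such as the benchmarks and kernel runs the abstract mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Hypothetical platform description; values must come from measurement."""
    name: str
    flops_per_sec: float            # sustained compute rate per processor
    latency_sec: float              # startup cost per message
    bandwidth_bytes_per_sec: float  # sustained point-to-point bandwidth


def predict_time(platform, total_flops, messages_per_proc,
                 bytes_per_message, procs):
    """Predicted wall-clock time: divided compute plus communication.

    Assumption: work divides evenly across processors and communication
    does not overlap with computation.
    """
    compute = total_flops / (platform.flops_per_sec * procs)
    if procs == 1:
        comm = 0.0  # a single processor exchanges no messages
    else:
        per_message = (platform.latency_sec
                       + bytes_per_message / platform.bandwidth_bytes_per_sec)
        comm = messages_per_proc * per_message
    return compute + comm


def predict_speedup(platform, total_flops, messages_per_proc,
                    bytes_per_message, procs):
    """Predicted speedup relative to a single processor of the same platform."""
    t1 = predict_time(platform, total_flops, messages_per_proc,
                      bytes_per_message, 1)
    tp = predict_time(platform, total_flops, messages_per_proc,
                      bytes_per_message, procs)
    return t1 / tp
```

Calling `predict_time` with one `Platform` instance per candidate machine, using the same workload parameters, gives a side-by-side estimate without porting the code. This mirrors the kind of comparison the abstract describes, although the model used in the paper may be considerably more detailed.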