The NAS parallel benchmarks—summary and preliminary results
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Compiler-optimized simulation of large-scale applications on high performance architectures
Journal of Parallel and Distributed Computing - Parallel and Distributed Discrete Event Simulation--An Emerging Technology
PARKBENCH: Methodology, Relations and Results
HPCN Europe 1996 Proceedings of the International Conference and Exhibition on High-Performance Computing and Networking
Using MPI Communication Patterns to Guide Source Code Transformations
ICCS '08 Proceedings of the 8th international conference on Computational Science, Part III
A Simulator for Large-Scale Parallel Computer Architectures
International Journal of Distributed Systems and Technologies
A communications simulation methodology for AMR codes using task dependency analysis
IA^3 '13 Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms
Hi-index | 0.00 |
The design of high-performance computing architectures requires performance analysis of large-scale parallel applications to derive various parameters concerning hardware design and software development. The process of performance analysis and benchmarking an application can be done in several ways with varying degrees of fidelity. One of the most cost-effective ways is to do a coarse-grained study of large-scale parallel applications through the use of program skeletons. The concept of a "program skeleton" that we discuss in this paper is an abstracted program that is derived from a larger program where source code that is determined to be irrelevant is removed for the purposes of the skeleton. In this work, we develop a semi-automatic approach for extracting program skeletons based on compiler program analysis. We demonstrate correctness of our skeleton extraction process by comparing details from communication traces, as well as show the performance speedup of using skeletons by running simulations in the SST/macro simulator.