Performance modeling for scientific applications is important for assessing potential application performance and for systems procurement in high-performance computing (HPC). Recent progress in communication tracing opens up novel opportunities for communication modeling because traces can now be collected losslessly yet scalably. Estimating the impact of scaling on communication efficiency nevertheless remains non-trivial due to execution-time variations and exposure to hardware and software artifacts. This work contributes a fundamentally novel modeling scheme. We synthetically generate the application trace for large numbers of nodes by extrapolation from a set of smaller traces. We devise an innovative approach for topology extrapolation of single program, multiple data (SPMD) codes with stencil or mesh communication. The extrapolated trace can subsequently be (a) replayed to assess communication requirements before porting an application, (b) transformed to auto-generate communication benchmarks for various target platforms, and (c) analyzed to detect communication inefficiencies and scalability limitations. To the best of our knowledge, rapidly obtaining the communication behavior of parallel applications at arbitrary scale with the availability of timed replay, yet without actual execution of the application at this scale, is without precedent and has the potential to enable otherwise infeasible system simulation at the exascale level.
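To make the idea of topology extrapolation concrete, the following is a minimal Python sketch under stated assumptions; it is not the scheme developed in this work. It assumes a 2D stencil code run on square process grids, where each small trace has been reduced to the sorted list of rank offsets an interior rank sends to. The function names (`fit_offset`, `extrapolate_offsets`) and the sample data are hypothetical. Each offset is fitted as a linear function of the grid dimension and then evaluated at the larger target scale.

```python
import math


def fit_offset(samples):
    """Fit offset = a + b * dim from the two smallest runs.

    samples: list of (grid_dim, offset) pairs, sorted by grid_dim.
    Remaining samples are used only to check that the linear fit holds.
    """
    (d0, o0), (d1, o1) = samples[0], samples[1]
    b = (o1 - o0) / (d1 - d0)
    a = o0 - b * d0
    for dim, offset in samples[2:]:
        if abs(a + b * dim - offset) > 1e-9:
            raise ValueError("offset does not scale linearly with grid dimension")
    return a, b


def extrapolate_offsets(observed, target_ranks):
    """Predict neighbor offsets at target_ranks from smaller runs.

    observed: {num_ranks: sorted list of neighbor offsets} (hypothetical input,
    assumed to come from traces of square-grid runs).
    """
    target_dim = math.isqrt(target_ranks)
    if target_dim * target_dim != target_ranks:
        raise ValueError("this sketch assumes a square process grid")
    sizes = sorted(observed)
    if len(sizes) < 2:
        raise ValueError("need at least two smaller runs to extrapolate")
    count = len(observed[sizes[0]])
    if any(len(observed[p]) != count for p in sizes):
        raise ValueError("all runs must show the same number of neighbors")
    predicted = []
    for i in range(count):
        samples = [(math.isqrt(p), observed[p][i]) for p in sizes]
        a, b = fit_offset(samples)
        predicted.append(round(a + b * target_dim))
    return predicted


# Made-up offsets for a 5-point stencil on 4x4, 8x8, and 16x16 grids.
observed = {
    16: [-4, -1, 1, 4],
    64: [-8, -1, 1, 8],
    256: [-16, -1, 1, 16],
}
predicted = extrapolate_offsets(observed, 1024)
print(predicted)
```

The sketch covers only the communication topology; extrapolating message sizes and timing information would require separate fits and is not attempted here.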