Cross-Platform Performance Prediction of Parallel Applications Using Partial Execution

Authors:
Leo T. Yang; Xiaosong Ma; Frank Mueller
Affiliations:
North Carolina State University, Raleigh; Oak Ridge National Laboratory; North Carolina State University, Raleigh
Venue:
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Year:
2005



Abstract

Performance prediction across platforms is increasingly important as developers can choose from a wide range of execution platforms. The main challenge remains to make accurate predictions at low cost across different architectures. In this paper, we derive an affordable method for cross-platform performance translation based on the relative performance between two platforms. We argue that relative performance can be observed without running a parallel application in full. We show that it suffices to observe very short partial executions of an application, since most parallel codes are iterative and behave in a predictable manner after a minimal startup period. This novel prediction approach is observation-based: it does not require program modeling, code analysis, or architectural simulation. Our performance results using real platforms and production codes demonstrate that predictions derived from partial executions can yield high accuracy at low cost. We also assess the limitations of our model and identify future research directions for observation-based performance prediction.
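
The idea of translating performance through a relative ratio can be illustrated with a minimal Python sketch. It is not the authors' formulation: the abstract gives no formula, so the ratio-of-means estimate, the function names, and the `skip` parameter (the number of startup timesteps to discard) are all assumptions made for illustration. It assumes that per-timestep wall-clock times from a short partial run are available on both platforms and that the full run time on the reference platform is known.

```python
from statistics import mean


def relative_performance(ref_steps, target_steps, skip=1):
    """Estimate the target/reference speed ratio from partial executions.

    ref_steps, target_steps: per-timestep wall-clock times (seconds) from a
    short partial run on each platform. The first `skip` timesteps are
    dropped as the startup period (hypothetical parameter).
    """
    if len(ref_steps) <= skip or len(target_steps) <= skip:
        raise ValueError("need more timesteps than the startup period")
    return mean(target_steps[skip:]) / mean(ref_steps[skip:])


def predict_target_time(ref_full_time, ref_steps, target_steps, skip=1):
    """Scale the known full run time on the reference platform by the ratio."""
    ratio = relative_performance(ref_steps, target_steps, skip)
    return ref_full_time * ratio


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the paper.
    ref = [5.0, 2.0, 2.0, 2.0]      # reference platform, first step is startup
    target = [4.0, 1.0, 1.0, 1.0]   # target platform, first step is startup
    print(predict_target_time(1000.0, ref, target))
```

Discarding the first timesteps reflects the abstract's observation that iterative codes settle into predictable behavior only after a startup period. How many steps to skip and how many to observe would have to be chosen per application.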