A regression-based approach to scalability prediction

Authors:
Bradley J. Barnes; Barry Rountree; David K. Lowenthal; Jaxk Reeves; Bronis de Supinski; Martin Schulz
Affiliations:
University of Georgia, Athens, GA, USA; University of Georgia, Athens, GA, USA; University of Georgia, Athens, GA, USA; University of Georgia, Athens, GA, USA; Lawrence Livermore National Laboratory, Livermore, CA, USA; Lawrence Livermore National Laboratory, Livermore, CA, USA
Venue:
Proceedings of the 22nd annual international conference on Supercomputing
Year:
2008

Abstract

Many applied scientific domains are increasingly relying on large-scale parallel computation. Consequently, many large clusters now have thousands of processors. However, the ideal number of processors to use for these scientific applications varies with both the input variables and the machine under consideration, and predicting this processor count is rarely straightforward. Accurate prediction mechanisms would provide many benefits, including improving cluster efficiency and identifying system configuration or hardware issues that impede performance. We explore novel regression-based approaches to predict parallel program scalability. We use several program executions on a small subset of the processors to predict execution time on larger numbers of processors. We compare three different regression-based techniques: one based on execution time only; another that uses per-processor information only; and a third one based on the global critical path. These techniques provide accurate scaling predictions, with median prediction errors between 6.2% and 17.3% for seven applications.
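The first technique named in the abstract, prediction from execution time only, can be illustrated with a minimal sketch. The abstract does not give the model form, so this example assumes a log-linear relation, log2(T) = b0 + b1 * log2(p), fitted by ordinary least squares; the function names and the timing data are hypothetical and are not taken from the paper.

```python
import math


def fit_log_linear(procs, times):
    """Least-squares fit of log2(time) = b0 + b1 * log2(procs).

    Assumed model form for illustration only; returns (b0, b1).
    """
    xs = [math.log2(p) for p in procs]
    ys = [math.log2(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict_time(model, procs):
    """Extrapolate execution time to a processor count outside the training set."""
    b0, b1 = model
    return 2.0 ** (b0 + b1 * math.log2(procs))


def relative_error(predicted, actual):
    """Absolute prediction error as a fraction of the measured time."""
    return abs(predicted - actual) / actual


# Synthetic timings (not measurements): T(p) = 1000 / p**0.9 seconds.
train_procs = [2, 4, 8, 16]
train_times = [1000.0 / p ** 0.9 for p in train_procs]

model = fit_log_linear(train_procs, train_times)
predicted_256 = predict_time(model, 256)
actual_256 = 1000.0 / 256 ** 0.9
error_256 = relative_error(predicted_256, actual_256)
```

Because the synthetic data follow the assumed model exactly, the fit recovers the exponent and the extrapolation error is essentially zero. Measured timings would deviate from any single curve, which is why the abstract reports median prediction errors across applications.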