An approach to performance prediction for parallel applications

Authors:
Engin Ipek;Bronis R. de Supinski;Martin Schulz;Sally A. McKee
Affiliations:
Computer Systems Lab, School of Electrical and Computer Engineering, Cornell University;Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA;Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA;Computer Systems Lab, School of Electrical and Computer Engineering, Cornell University
Venue:
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Year:
2005

Abstract

Accurately modeling and predicting the performance of large-scale applications becomes increasingly difficult as system complexity grows. Analytic predictive models are useful, but they are difficult to construct, usually limited in scope, and often fail to capture subtle interactions between architecture and software. In contrast, we employ multilayer neural networks trained on input data from executions on the target platform. This approach is useful for predicting many aspects of performance, and it captures full system complexity. Our models are developed automatically from the training input set, avoiding the difficult and potentially error-prone process required to develop analytic models. This study focuses on the high-performance parallel application SMG2000, a much-studied code whose variations in execution time are still not well understood. Our model predicts performance on two large-scale parallel platforms to within 5%-7% error across a large, multi-dimensional parameter space.
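
The general idea of training a multilayer network on measured runs and then predicting unseen configurations can be sketched in a few lines. The following is a minimal illustration only, not the authors' model: the two input parameters (a normalized problem size `n` and a normalized processor count `p`), the synthetic runtime formula, the network size, and the learning rate are all assumptions made for this example, and no SMG2000 data is involved.

```python
import math
import random


def make_dataset(rng, count):
    """Generate synthetic (parameters, runtime) pairs.

    The formula is a hypothetical stand-in for measured execution times.
    """
    data = []
    for _ in range(count):
        n = rng.uniform(0.1, 1.0)  # normalized problem size (assumed)
        p = rng.uniform(0.1, 1.0)  # normalized processor count (assumed)
        t = 0.7 * n * n + 0.3 * n * (1.0 - p)
        data.append(((n, p), t))
    return data


class TinyMLP:
    """One hidden tanh layer and a linear output, trained by plain SGD."""

    def __init__(self, rng, n_in=2, n_hidden=8):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def train_step(self, x, target, lr):
        h = self._hidden(x)
        out = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        err = out - target  # gradient of 0.5 * err**2 w.r.t. the output
        for j, hj in enumerate(h):
            # Back-propagate through tanh using the pre-update output weight.
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err


def mean_abs_error(model, data):
    return sum(abs(model.predict(x) - t) for x, t in data) / len(data)


rng = random.Random(0)            # fixed seed for repeatability
train = make_dataset(rng, 200)    # "measured" training runs
held_out = make_dataset(rng, 50)  # configurations not seen in training
model = TinyMLP(rng)

error_before = mean_abs_error(model, held_out)
for _ in range(300):              # training epochs
    rng.shuffle(train)
    for x, t in train:
        model.train_step(x, t, lr=0.05)
error_after = mean_abs_error(model, held_out)

print(f"held-out mean absolute error: {error_before:.3f} -> {error_after:.3f}")
```

Evaluating on configurations held out from training mirrors the question the abstract poses: how well the model predicts points in the parameter space that it has not observed. In a real study the synthetic formula would be replaced by measured execution times from the target platform.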