Many applied scientific domains increasingly rely on large-scale parallel computation, and many large clusters now have thousands of processors. However, the ideal number of processors to use for these scientific applications varies with both the input variables and the machine under consideration, and predicting this processor count is rarely straightforward. Accurate prediction mechanisms would provide many benefits, including improving cluster efficiency and identifying system configuration or hardware issues that impede performance. We explore novel regression-based approaches to predict parallel program scalability. We use several program executions on a small subset of the processors to predict execution time on larger numbers of processors. We compare three regression-based techniques: one based on execution time only, another that uses per-processor information only, and a third based on the global critical path. These techniques provide accurate scaling predictions, with median prediction errors between 6.2% and 17.3% for seven applications.
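To make the idea of extrapolating from small runs concrete, the following is a minimal sketch of the simplest variant, regression on execution time only. It is not the authors' model: the power-law form T(p) = 2^a · p^b, the function names `fit_power_law`, `predict_time`, and `relative_error`, and all timing values are assumptions chosen for illustration.

```python
import math


def fit_power_law(procs, times):
    """Least-squares fit of log2(T) = a + b * log2(p); returns (a, b).

    Assumption: execution time follows a power law in the processor count.
    """
    xs = [math.log2(p) for p in procs]
    ys = [math.log2(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_time(a, b, p):
    """Evaluate the fitted model at processor count p."""
    return 2.0 ** (a + b * math.log2(p))


def relative_error(predicted, measured):
    """Relative prediction error, as a fraction of the measured time."""
    return abs(predicted - measured) / measured


# Hypothetical timings (seconds) from runs on a small subset of processors.
small_procs = [2, 4, 8, 16]
small_times = [100.0, 52.0, 27.5, 15.0]

a, b = fit_power_law(small_procs, small_times)
predicted = predict_time(a, b, 64)  # extrapolate to a larger processor count
print(f"fitted exponent b = {b:.3f}, predicted T(64) = {predicted:.2f} s")
```

A pure power law cannot capture communication overhead that grows with the processor count, which is one reason a technique might bring in richer inputs such as the per-processor or critical-path information mentioned above.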