Probabilistic auto-tuning for architectures with complex constraints

Authors:
Benjamin Ylvisaker; Scott Hauck
Affiliations:
GrammaTech, Inc.; University of Washington
Venue:
Proceedings of the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era
Year:
2011

Abstract

Optimizing applications for coprocessor accelerator architectures such as FPGAs and GPUs is hard because application parameters must be tuned carefully to the size of the target architecture. Moreover, some combinations of parameters simply do not work, because they overuse a constrained resource. Auto-tuning (the use of search algorithms and empirical feedback to optimize programs) is an attractive solution, but existing auto-tuning methods do not handle unpredictable failures well. This paper describes a new auto-tuning method based on probabilistic predictions of multiple program features (run time, memory consumption, etc.). During configuration selection, these predictions are combined to balance the preference for configurations that are likely to be high quality against the preference for configurations that are likely to satisfy all constraints. In our experiments, the new method performed substantially better than the simpler approach of treating every failed configuration as if it had succeeded with "very low" quality; in many cases, the simpler strategy required more than twice as many trials to reach the same quality level.
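
The abstract does not spell out how the predictions are combined, so the following is only a minimal sketch of one plausible way to do it, not the authors' actual method. It assumes each feature prediction is an independent Gaussian given as a (mean, standard deviation) pair, and it scores a candidate by multiplying the probability that it beats the best run time seen so far by the probability that it stays within every resource limit. All names here (`prob_feasible`, `prob_improves`, `select_next`, the `memory` resource, and the numbers) are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """P(X <= x) for X ~ Normal(mean, std)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def prob_feasible(resource_preds, limits):
    """Probability that every resource stays within its limit.

    Assumption: predictions are independent Gaussians, so the joint
    probability is the product of the per-resource probabilities.
    """
    p = 1.0
    for name, (mean, std) in resource_preds.items():
        p *= normal_cdf(limits[name], mean, std)
    return p


def prob_improves(runtime_pred, best_runtime):
    """Probability that the predicted run time beats the best seen so far."""
    mean, std = runtime_pred
    return normal_cdf(best_runtime, mean, std)


def score(candidate, limits, best_runtime):
    """Balance likely quality against likely constraint satisfaction."""
    return (prob_improves(candidate["runtime"], best_runtime)
            * prob_feasible(candidate["resources"], limits))


def select_next(candidates, limits, best_runtime):
    """Pick the configuration with the highest combined score."""
    return max(candidates, key=lambda c: score(c, limits, best_runtime))


if __name__ == "__main__":
    limits = {"memory": 100.0}  # hypothetical resource budget
    candidates = [
        # Fast but likely to exceed the memory budget.
        {"name": "A", "runtime": (8.0, 1.0), "resources": {"memory": (120.0, 10.0)}},
        # Slightly faster than the incumbent and likely to fit.
        {"name": "B", "runtime": (9.5, 1.0), "resources": {"memory": (80.0, 10.0)}},
        # Safe but unlikely to improve on the incumbent.
        {"name": "C", "runtime": (12.0, 1.0), "resources": {"memory": (50.0, 10.0)}},
    ]
    chosen = select_next(candidates, limits, best_runtime=10.0)
    print(chosen["name"])
```

In this toy setup the fastest candidate is passed over because it will probably overuse memory, and the safest one is passed over because it will probably not improve the result. A scheme that only learns "failed configurations are very slow" has no separate feasibility estimate to make that distinction with.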