Applications are hard to optimize for coprocessor accelerator architectures such as FPGAs and GPUs, because application parameters must be tuned carefully to the size of the target architecture. Moreover, some combinations of parameters simply do not work, because they overuse a constrained resource. Auto-tuning (the use of search algorithms and empirical feedback to optimize programs) is an attractive solution, but existing auto-tuning methods do not handle unpredictable failures well. This paper describes a new auto-tuning method based on probabilistic predictions of multiple program features (run time, memory consumption, etc.). During configuration selection, these predictions are combined to balance the preference for configurations that are likely to be high quality against the preference for configurations that are likely to satisfy all constraints. In our experiments, the new method performed substantially better than the simpler approach of treating every failed configuration as if it had succeeded with a "very low" quality. In many cases, the simpler strategy required more than twice as many trials to reach the same quality level.
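The abstract does not say how the per-feature predictions are combined, so the following is only a minimal sketch of one plausible rule, not the authors' method. It assumes each feature prediction is an independent Gaussian given as a (mean, standard deviation) pair, and it scores a candidate by multiplying the expected improvement in run time by the probability that every constrained feature stays within its limit. All names (`expected_improvement`, `prob_within_limit`, `score`, `pick_next`) and all numbers in the candidate table are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def expected_improvement(mean, std, best):
    """Expected reduction below `best` for a Gaussian prediction (lower is better)."""
    if std <= 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    return (best - mean) * normal_cdf(z) + std * normal_pdf(z)


def prob_within_limit(mean, std, limit):
    """Probability that a Gaussian-predicted feature does not exceed `limit`."""
    if std <= 0.0:
        return 1.0 if mean <= limit else 0.0
    return normal_cdf((limit - mean) / std)


def score(candidate, best_time, limits):
    """Assumed combination rule: expected improvement times P(all constraints hold).

    Constraints are treated as independent, which is a simplifying assumption.
    """
    ei = expected_improvement(*candidate["time"], best_time)
    p_ok = 1.0
    for feature, limit in limits.items():
        mean, std = candidate[feature]
        p_ok *= prob_within_limit(mean, std, limit)
    return ei * p_ok


def pick_next(candidates, best_time, limits):
    """Return the name of the candidate configuration with the highest score."""
    return max(candidates, key=lambda name: score(candidates[name], best_time, limits))


# Hypothetical predictions: feature -> (mean, standard deviation).
candidates = {
    "aggressive": {"time": (8.0, 1.0), "memory": (110.0, 10.0)},
    "moderate": {"time": (9.0, 1.0), "memory": (80.0, 10.0)},
    "safe": {"time": (10.5, 0.5), "memory": (50.0, 5.0)},
}
limits = {"memory": 100.0}  # hypothetical resource limit
best_time = 10.0            # best run time observed so far (hypothetical)

print(pick_next(candidates, best_time, limits))
```

With these made-up numbers, the "aggressive" candidate promises the largest speedup but is predicted to exceed the memory limit most of the time, so its score is discounted heavily and the "moderate" candidate is chosen. The simpler strategy described in the abstract would instead record a failed trial as a success with a very poor run time, which gives the search no direct signal about which constraint was violated.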