Autotuning systems use empirical techniques to evaluate a search space of possible implementations of a computation. Autotuning has emerged as a critical strategy for achieving high performance as architectural complexity grows. Present-day autotuning technology either augments the capabilities of expert users or is hidden inside compilers, but it has not yet been adopted as a mainstream technology. Drawing on our own experience and that of others in developing autotuning technology and applying it to libraries and applications, this paper examines some of the barriers to its adoption and identifies future research areas that could break down these barriers.