Towards making autotuning mainstream

Authors:
Protonu Basu;Mary Hall;Malik Khan;Suchit Maindola;Saurav Muralidharan;Shreyas Ramalingam;Axel Rivera;Manu Shantharam;Anand Venkat
Affiliations:
School of Computing, University of Utah, Salt Lake City, UT, USA;School of Computing, University of Utah, Salt Lake City, UT, USA;National University of Science and Technology, Islamabad, Pakistan;School of Computing, University of Utah, Salt Lake City, UT, USA;School of Computing, University of Utah, Salt Lake City, UT, USA;School of Computing, University of Utah, Salt Lake City, UT, USA;School of Computing, University of Utah, Salt Lake City, UT, USA;School of Computing, University of Utah, Salt Lake City, UT, USA;School of Computing, University of Utah, Salt Lake City, UT, USA
Venue:
International Journal of High Performance Computing Applications
Year:
2013

Abstract

Autotuning systems employ empirical techniques to evaluate the suitability of a search space of possible implementations of a computation. Autotuning has emerged as a critical strategy for achieving high performance as architectural complexity grows. Present-day autotuning technology augments the capabilities of expert users or is hidden inside compilers, but to date has not been adopted as a mainstream technology. Based on our prior experience and the experience of others in developing autotuning technology and applying it to libraries and applications, this paper examines some of the barriers to adoption of the technology and future research areas to break down these barriers.