Optimizing for reduced code space using genetic algorithms
Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
Automatically tuned linear algebra software
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Adaptive Optimizing Compilers for the 21st Century
The Journal of Supercomputing
Learning to Predict Performance from Formula Modeling and Training Data
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
A Machine Learning Approach to Automatic Production of Compiler Heuristics
AIMSA '02 Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications
Compiler optimization-space exploration
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Finding effective optimization phase sequences
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Meta optimization: improving compiler heuristics with machine learning
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation
PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
ADAPT: Automated De-Coupled Adaptive Program Transformation
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Predicting Unroll Factors Using Supervised Classification
Proceedings of the international symposium on Code generation and optimization
A Model-Based Framework: An Approach for Profit-Driven Optimization
Proceedings of the international symposium on Code generation and optimization
Rating Compiler Optimizations for Automatic Performance Tuning
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Probabilistic source-level optimisation of embedded programs
LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Improving virtual machine performance using a cross-run profile repository
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Using Machine Learning to Focus Iterative Optimization
Proceedings of the International Symposium on Code Generation and Optimization
Fast and Effective Orchestration of Compiler Optimizations for Automatic Performance Tuning
Proceedings of the International Symposium on Code Generation and Optimization
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
The Journal of Supercomputing
Online performance auditing: using hot optimizations without getting burned
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Automatic performance model construction for the fast software exploration of new hardware designs
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Rapidly Selecting Good Compiler Optimizations using Performance Counters
Proceedings of the International Symposium on Code Generation and Optimization
Automating the construction of compiler heuristics using machine learning
Automating the construction of compiler heuristics using machine learning
Cole: compiler optimization level exploration
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Producing wrong data without doing anything obviously wrong!
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
MiDataSets: creating the conditions for a more realistic evaluation of Iterative optimization
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
A practical method for quickly evaluating program optimizations
HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
Performance optimization on a supercomputer with cTuning and the PGI compiler
Proceedings of the 2nd International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era
Deconstructing iterative optimization
ACM Transactions on Architecture and Code Optimization (TACO)
Continuous learning of compiler heuristics
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Finding good optimization sequences covering program space
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
On the determination of inlining vectors for program optimization
CC'13 Proceedings of the 22nd international conference on Compiler Construction
Inferred Models for Dynamic and Sparse Hardware-Software Spaces
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Hi-index | 0.00 |
Iterative optimization is a popular and efficient research approach to optimize programs using feedback-directed compilation. However, one of the key limitations that prevented widespread use in production compilers and day-to-day practice is the necessity to perform a large number of program runs with the same dataset and environment (architecture, OS, compiler) to test many different combinations of optimizations. In this article, we propose to overcome such a practical obstacle using collective optimization, where the task of optimizing a program or tuning default compiler optimization heuristic leverages the experience of many other users continuously, rather than being performed in isolation, and often redundantly, by each user. During this unobtrusive approach, performance information is sent to a central database after each run and statistically combined with the data from all users to suggest most profitable optimizations for a given program and an architecture, or to gradually improve default optimization level of a compiler for a given architecture. In this article, we address two key challenges of collective optimization. We show that it is possible to simultaneously learn and improve performance while avoiding long training phases. We also demonstrate how to use our approach with static compilers to learn optimizations across multiple datasets and architectures without even a reference run normally needed to compute speedups over the baseline optimization by using static function cloning and dynamic adaptation. We present a novel probabilistic approach based on competition among pairs of optimizations (program reaction to optimizations) to enable optimization knowledge reuse and achieve nearly the best possible iterative optimization performance. We implemented our technique in GCC (widespread production open-source compiler that supports multiple architectures) and connected it to a public collective optimization database at cTuning.org to gather profile and optimization data continuously and transparently in realistic environments ranging from desktop PCs and mobile systems to supercomputers and data centers.