Collective optimization: A practical collaborative approach

Authors:
Grigori Fursin;Olivier Temam
Affiliations:
INRIA Saclay, France, Orsay, France;INRIA Saclay, France, Orsay, France
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2010

Citing 29
Cited 6

Optimizing for reduced code space using genetic algorithms

Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
Automatically tuned linear algebra software

SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Adaptive Optimizing Compilers for the 21st Century

The Journal of Supercomputing
Learning to Predict Performance from Formula Modeling and Training Data

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
A Machine Learning Approach to Automatic Production of Compiler Heuristics

AIMSA '02 Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications
Compiler optimization-space exploration

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Finding effective optimization phase sequences

Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Meta optimization: improving compiler heuristics with machine learning

PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation

PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
ADAPT: Automated De-Coupled Adaptive Program Transformation

ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Predicting Unroll Factors Using Supervised Classification

Proceedings of the international symposium on Code generation and optimization
A Model-Based Framework: An Approach for Profit-Driven Optimization

Proceedings of the international symposium on Code generation and optimization
Rating Compiler Optimizations for Automatic Performance Tuning

Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Probabilistic source-level optimisation of embedded programs

LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Improving virtual machine performance using a cross-run profile repository

OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Using Machine Learning to Focus Iterative Optimization

Proceedings of the International Symposium on Code Generation and Optimization
Fast and Effective Orchestration of Compiler Optimizations for Automatic Performance Tuning

Proceedings of the International Symposium on Code Generation and Optimization
MiBench: A free, commercially representative embedded benchmark suite

WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Automatic tuning of whole applications using direct search and a performance-based transformation system

The Journal of Supercomputing
Online performance auditing: using hot optimizations without getting burned

Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Automatic performance model construction for the fast software exploration of new hardware designs

CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Rapidly Selecting Good Compiler Optimizations using Performance Counters

Proceedings of the International Symposium on Code Generation and Optimization
Automating the construction of compiler heuristics using machine learning

Automating the construction of compiler heuristics using machine learning
Cole: compiler optimization level exploration

Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Producing wrong data without doing anything obviously wrong!

Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Portable compiler optimisation across embedded programs and microarchitectures using machine learning

Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
MiDataSets: creating the conditions for a more realistic evaluation of Iterative optimization

HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
A practical method for quickly evaluating program optimizations

HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers

Performance optimization on a supercomputer with cTuning and the PGI compiler

Proceedings of the 2nd International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era
Deconstructing iterative optimization

ACM Transactions on Architecture and Code Optimization (TACO)
Continuous learning of compiler heuristics

ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Finding good optimization sequences covering program space

ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
On the determination of inlining vectors for program optimization

CC'13 Proceedings of the 22nd international conference on Compiler Construction
Inferred Models for Dynamic and Sparse Hardware-Software Spaces

MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

Iterative optimization is a popular and efficient research approach to optimize programs using feedback-directed compilation. However, one of the key limitations that prevented widespread use in production compilers and day-to-day practice is the necessity to perform a large number of program runs with the same dataset and environment (architecture, OS, compiler) to test many different combinations of optimizations. In this article, we propose to overcome such a practical obstacle using collective optimization, where the task of optimizing a program or tuning default compiler optimization heuristic leverages the experience of many other users continuously, rather than being performed in isolation, and often redundantly, by each user. During this unobtrusive approach, performance information is sent to a central database after each run and statistically combined with the data from all users to suggest most profitable optimizations for a given program and an architecture, or to gradually improve default optimization level of a compiler for a given architecture. In this article, we address two key challenges of collective optimization. We show that it is possible to simultaneously learn and improve performance while avoiding long training phases. We also demonstrate how to use our approach with static compilers to learn optimizations across multiple datasets and architectures without even a reference run normally needed to compute speedups over the baseline optimization by using static function cloning and dynamic adaptation. We present a novel probabilistic approach based on competition among pairs of optimizations (program reaction to optimizations) to enable optimization knowledge reuse and achieve nearly the best possible iterative optimization performance. We implemented our technique in GCC (widespread production open-source compiler that supports multiple architectures) and connected it to a public collective optimization database at cTuning.org to gather profile and optimization data continuously and transparently in realistic environments ranging from desktop PCs and mobile systems to supercomputers and data centers.