Automatic Feature Generation for Machine Learning Based Optimizing Compilation

Authors:
Hugh Leather;Edwin Bonilla;Michael O'Boyle
Venue:
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Year:
2009

Abstract

Recent work has shown that machine learning can automate, and in some cases outperform, hand-crafted compiler optimizations. Such approaches typically rely on summaries, or features, of the program. The quality of these features is critical to the accuracy of the resulting machine-learned heuristic; no machine learning method will work well with poorly chosen features. However, because programs are large and complex, there is in theory an infinite number of potential features to choose from, and the compiler writer must expend effort choosing the best features from this space. This paper develops a novel mechanism to automatically find those features which most improve the quality of the machine-learned heuristic. The feature space is described by a grammar and is then searched with genetic programming and predictive modeling. We apply this technique to loop unrolling in GCC 4.3.1 and evaluate our approach on a Pentium 6. On a benchmark suite of 57 programs, GCC's hard-coded heuristic achieves only 3% of the maximum performance available, while a state-of-the-art machine learning approach with hand-coded features obtains 59%. Our feature generation technique achieves 76% of the maximum available speedup, outperforming existing approaches.
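
The idea of describing a feature space with a grammar and searching it with an evolutionary procedure guided by a predictive model can be illustrated with a small sketch. This is not the paper's implementation. The toy loop data, the opcode names, the three-rule grammar, and the helper names (`random_feature`, `evaluate`, `fitness`, `mutate`, `search`) are all assumptions made for illustration. The search uses mutation and truncation selection only, which is a simplification of genetic programming, and a leave-one-out nearest-neighbour predictor stands in for the predictive model.

```python
import random

# Hypothetical toy data: each loop is a list of opcode names paired with the
# unroll factor assumed to run fastest (made-up values, not measurements).
LOOPS = [
    (["load", "add", "store"], 8),
    (["load", "load", "mul", "add", "store"], 4),
    (["load", "call", "store"], 1),
    (["load", "mul", "branch", "call", "store", "add"], 1),
    (["add", "add"], 8),
    (["load", "mul", "mul", "add", "store", "store"], 4),
]
OPCODES = ["load", "store", "add", "mul", "call", "branch"]

# Assumed toy grammar for feature expressions:
#   feature ::= count(opset) | feature + feature | feature * feature
#   opset   ::= a set of one or two opcode names
def random_feature(rng, depth=0):
    """Derive a random feature expression from the toy grammar."""
    if depth >= 2 or rng.random() < 0.5:
        size = rng.randint(1, 2)
        return ("count", frozenset(rng.sample(OPCODES, size)))
    op = rng.choice(["+", "*"])
    return (op, random_feature(rng, depth + 1), random_feature(rng, depth + 1))

def evaluate(feature, loop):
    """Compute the numeric value of a feature expression on one loop."""
    tag = feature[0]
    if tag == "count":
        return sum(1 for opcode in loop if opcode in feature[1])
    left, right = evaluate(feature[1], loop), evaluate(feature[2], loop)
    return left + right if tag == "+" else left * right

def fitness(feature):
    """Leave-one-out accuracy of a 1-nearest-neighbour predictor
    that sees only this single feature."""
    values = [evaluate(feature, loop) for loop, _ in LOOPS]
    hits = 0
    for i, (_, best) in enumerate(LOOPS):
        others = [j for j in range(len(LOOPS)) if j != i]
        nearest = min(others, key=lambda j: abs(values[j] - values[i]))
        hits += LOOPS[nearest][1] == best
    return hits / len(LOOPS)

def mutate(feature, rng):
    """Replace a randomly chosen subtree with a fresh random derivation."""
    if feature[0] == "count" or rng.random() < 0.3:
        return random_feature(rng)
    tag, left, right = feature
    if rng.random() < 0.5:
        return (tag, mutate(left, rng), right)
    return (tag, left, mutate(right, rng))

def search(generations=30, population_size=20, seed=0):
    """Keep the fitter half each generation and refill it with mutants."""
    rng = random.Random(seed)
    population = [random_feature(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return max(population, key=fitness)

if __name__ == "__main__":
    best = search()
    print(best, fitness(best))
```

In this sketch a feature is scored by how well a model predicts the best unroll factor from that feature alone, so the search is steered by predictive accuracy rather than by any hand-written notion of what a good feature looks like. A fuller version would need crossover, a richer grammar over the compiler's intermediate representation, and held-out benchmarks for evaluation.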