Neural networks for pattern recognition
Neural networks for pattern recognition
ProfileMe: hardware support for instruction-level profiling on out-of-order processors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Optimizing for reduced code space using genetic algorithms
Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
Automatically tuned linear algebra software
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Theory of Information and Coding
Theory of Information and Coding
Adaptive Optimizing Compilers for the 21st Century
The Journal of Supercomputing
A Machine Learning Approach to Automatic Production of Compiler Heuristics
AIMSA '02 Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications
Compiler optimization-space exploration
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
A comparison of empirical and model-driven optimization
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Meta optimization: improving compiler heuristics with machine learning
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation
PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
Fast searches for effective optimization phase sequences
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Inducing heuristics to decide whether to schedule
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Predicting Unroll Factors Using Supervised Classification
Proceedings of the international symposium on Code generation and optimization
A Model-Based Framework: An Approach for Profit-Driven Optimization
Proceedings of the international symposium on Code generation and optimization
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
ACME: adaptive compilation made efficient
LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Probabilistic source-level optimisation of embedded programs
LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Statistical Models for Empirical Search-Based Performance Tuning
International Journal of High Performance Computing Applications
Online performance analysis by statistical sampling of microprocessor performance counters
Proceedings of the 19th annual international conference on Supercomputing
Think globally, search locally
Proceedings of the 19th annual international conference on Supercomputing
Automatic Selection of Compiler Options Using Non-parametric Inferential Statistics
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Using Machine Learning to Focus Iterative Optimization
Proceedings of the International Symposium on Code Generation and Optimization
Exhaustive Optimization Phase Order Space Exploration
Proceedings of the International Symposium on Code Generation and Optimization
Fast and Effective Orchestration of Compiler Optimizations for Automatic Performance Tuning
Proceedings of the International Symposium on Code Generation and Optimization
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Fast, automatic, procedure-level performance tuning
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Method-specific dynamic compilation using logistic regression
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Phase-based adaptive recompilation in a JVM
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Cole: compiler optimization level exploration
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Exploring and predicting the architecture/optimising compiler co-design space
CASES '08 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
Mapping parallelism to multi-cores: a machine learning based approach
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Scenario Based Optimization: A Framework for Statically Enabling Online Optimizations
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
OptiScope: Performance Accountability for Optimizing Compilers
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Variant-based competitive parallel execution of sequential programs
Proceedings of the 7th ACM international conference on Computing frontiers
Evaluating iterative optimization across 1000 datasets
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Proceedings of the 13th International Workshop on Software & Compilers for Embedded Systems
A case for machine learning to optimize multicore performance
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
Practical aggregation of semantical program properties for machine learning based optimization
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Collective optimization: A practical collaborative approach
ACM Transactions on Architecture and Code Optimization (TACO)
Support of Android lab modules for embedded system curriculum
WESE '10 Proceedings of the 2010 Workshop on Embedded Systems Education
Efficiently exploring compiler optimization sequences with pairwise pruning
Proceedings of the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era
Loaf: a framework and infrastructure for creating online adaptive solutions
Proceedings of the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era
Understanding stencil code performance on multicore architectures
Proceedings of the 8th ACM International Conference on Computing Frontiers
An evaluation of different modeling techniques for iterative compilation
CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
Using machine learning to improve automatic vectorization
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Iterative optimization for the data center
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Reducing training time in a one-shot machine learning-based compiler
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
ACM Transactions on Embedded Computing Systems (TECS)
Predictive modeling in a polyhedral optimization space
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
Using machines to learn method-specific compilation strategies
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
Selective search of inlining vectors for program optimization
Proceedings of the 9th conference on Computing Frontiers
Using graph-based program characterization for predictive modeling
Proceedings of the Tenth International Symposium on Code Generation and Optimization
Parallel iterative compilation: using MapReduce to speedup machine learning in compilers
Proceedings of third international workshop on MapReduce and its Applications Date
THeME: a system for testing by hardware monitoring events
Proceedings of the 2012 International Symposium on Software Testing and Analysis
Deconstructing iterative optimization
ACM Transactions on Architecture and Code Optimization (TACO)
Continuous learning of compiler heuristics
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Finding good optimization sequences covering program space
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
On the determination of inlining vectors for program optimization
CC'13 Proceedings of the 22nd international conference on Compiler Construction
Hybrid type legalization for a sparse SIMD instruction set
ACM Transactions on Architecture and Code Optimization (TACO)
Tools for machine-learning-based empirical autotuning and specialization
International Journal of High Performance Computing Applications
Hi-index | 0.00 |
Applying the right compiler optimizations to a particular program can have a significant impact on program performance. Due to the non-linear interaction of compiler optimizations, however, determining the best setting is nontrivial. There have been several proposed techniques that search the space of compiler options to find good solutions; however such approaches can be expensive. This paper proposes a different approach using performance counters as a means of determining good compiler optimization settings. This is achieved by learning a model off-line which can then be used to determine good settings for any new program. We show that such an approach outperforms the state-ofthe- art and is two orders of magnitude faster on average. Furthermore, we show that our performance counter-based approach outperforms techniques based on static code features. Using our technique we achieve a 17% improvement over the highest optimization setting of the commercial PathScale EKOPath 2.3.1 optimizing compiler on the SPEC benchmark suite on a recent AMD Athlon 64 3700+ platform.