Optimizing for reduced code space using genetic algorithms
Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
Continuous program optimization: A case study
ACM Transactions on Programming Languages and Systems (TOPLAS)
Meta optimization: improving compiler heuristics with machine learning
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation
PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Fast searches for effective optimization phase sequences
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
ACME: adaptive compilation made efficient
LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Probabilistic source-level optimisation of embedded programs
LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Optimizing instruction cache performance of embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research
IEEE Computer Architecture Letters
Using Machine Learning to Focus Iterative Optimization
Proceedings of the International Symposium on Code Generation and Optimization
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
A practical method for quickly evaluating program optimizations
HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
Minimal placement of bank selection instructions for partitioned memory architectures
ACM Transactions on Embedded Computing Systems (TECS)
An analytical model for the upper bound on temperature differences on a chip
Proceedings of the 18th ACM Great Lakes symposium on VLSI
Cache modeling in probabilistic execution time analysis
Proceedings of the 45th annual Design Automation Conference
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Workload Reduction for Multi-input Feedback-Directed Optimization
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Evaluation of Multicore Processors for Embedded Systems by Parallel Benchmark Program Using OpenMP
IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
Variant-based competitive parallel execution of sequential programs
Proceedings of the 7th ACM international conference on Computing frontiers
Evaluating iterative optimization across 1000 datasets
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
A profile-based tool for finding pipeline parallelism in sequential programs
Parallel Computing
Accurate direct and indirect on-chip temperature sensing for efficient dynamic thermal management
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems - Special section on the ACM IEEE international conference on formal methods and models for codesign (MEMOCODE) 2009
Collective optimization: A practical collaborative approach
ACM Transactions on Architecture and Code Optimization (TACO)
ReMAP: A Reconfigurable Heterogeneous Multicore Architecture
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Automatic estimation of performance requirements for software tasks of mobile devices
Proceedings of the 2nd ACM/SPEC International Conference on Performance engineering
Iterative optimization for the data center
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Deconstructing iterative optimization
ACM Transactions on Architecture and Code Optimization (TACO)
Continuous learning of compiler heuristics
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
TempoMP: integrated prediction and management of temperature in heterogeneous MPSoCs
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Exploiting GPU Hardware Saturation for Fast Compiler Optimization
Proceedings of Workshop on General Purpose Processing Using GPUs
Hi-index | 0.00 |
Iterative optimization has become a popular technique to obtain improvements over the default settings in a compiler for performance-critical applications, such as embedded applications. An implicit assumption, however, is that the best configuration found for any arbitrary data set will work well with other data sets that a program uses. In this article, we evaluate that assumption based on 20 data sets per benchmark of the MiBench suite. We find that, though a majority of programs exhibit stable performance across data sets, the variability can significantly increase with many optimizations. However, for the best optimization configurations, we find that this variability is in fact small. Furthermore, we show that it is possible to find a compromise optimization configuration across data sets which is often within 5% of the best possible configuration for most data sets, and that the iterative process can converge in less than 20 iterations (for a population of 200 optimization configurations). All these conclusions have significant and positive implications for the practical utilization of iterative optimization.