Evaluating iterative optimization across 1000 datasets

Authors:
Yang Chen;Yuanjie Huang;Lieven Eeckhout;Grigori Fursin;Liang Peng;Olivier Temam;Chengyong Wu
Affiliations:
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;Ghent University, Gent, Belgium;National Institute for Research in Computer and Control Sciences (INRIA), Saclay, France;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;National Institute for Research in Computer and Control Sciences (INRIA), Saclay, France;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Venue:
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Year:
2010

Abstract

While iterative optimization has become a popular compiler optimization approach, it is based on a premise which has never been truly evaluated: that it is possible to learn the best compiler optimizations across data sets. Up to now, most iterative optimization studies find the best optimizations through repeated runs on the same data set. Only a handful of studies have attempted to exercise iterative optimization on a few tens of data sets. In this paper, we truly put iterative compilation to the test for the first time by evaluating its effectiveness across a large number of data sets. We therefore compose KDataSets, a data set suite with 1000 data sets for 32 programs, which we release to the public. We characterize the diversity of KDataSets, and subsequently use it to evaluate iterative optimization. We demonstrate that it is possible to derive a robust iterative optimization strategy across data sets: for all 32 programs, we find that there exists at least one combination of compiler optimizations that achieves 86% or more of the best possible speedup across all data sets using Intel's ICC (83% for GNU's GCC). This optimal combination is program-specific and yields speedups up to 1.71 on ICC and 2.23 on GCC over the highest optimization level (-fast and -O3, respectively). This finding makes the task of optimizing programs across data sets much easier than previously anticipated, and it paves the way for the practical and reliable usage of iterative optimization. Finally, we derive pre-shipping and post-shipping optimization strategies for software vendors.
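
The robustness claim in the abstract can be illustrated with a small sketch. It assumes one possible reading of the metric: for each data set, take the best speedup any combination achieves, then pick the single combination whose worst-case fraction of that best is highest. The function name, the combination labels, and all numbers below are hypothetical and are not taken from the paper, whose exact selection procedure may differ.

```python
from typing import Dict, List, Tuple


def most_robust_combination(speedups: Dict[str, List[float]]) -> Tuple[str, float]:
    """Pick the combination with the highest worst-case fraction of the
    per-data-set best speedup (an assumed reading of the abstract's metric)."""
    names = list(speedups)
    n_datasets = len(speedups[names[0]])

    # Best speedup reached on each data set by any combination.
    best = [max(speedups[c][d] for c in names) for d in range(n_datasets)]

    def worst_fraction(combo: str) -> float:
        # Smallest share of the best speedup this combination reaches.
        return min(speedups[combo][d] / best[d] for d in range(n_datasets))

    winner = max(names, key=worst_fraction)
    return winner, worst_fraction(winner)


# Hypothetical speedups over the baseline level, one value per data set.
measurements = {
    "combo_a": [1.20, 1.10, 1.30],
    "combo_b": [1.50, 0.90, 1.25],
    "combo_c": [1.40, 1.05, 1.28],
}

winner, fraction = most_robust_combination(measurements)
print(f"{winner} reaches at least {fraction:.0%} of the best speedup on every data set")
```

In this toy input no combination is best on every data set, yet one of them stays close to the best everywhere, which is the kind of outcome the abstract reports at much larger scale.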