Iterative optimization for the data center

Authors:
Yang Chen;Shuangde Fang;Lieven Eeckhout;Olivier Temam;Chengyong Wu
Affiliations:
Institute of Computing Technology and Graduate school, Chinese Academy of Sciences, Beijing, China;nstitute of Computing Technology and Graduate school, Chinese Academy of Sciences, Beijing, China;Ghent University, Ghent, Belgium;INRIA, Saclay, Paris, France;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Venue:
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Year:
2012

Citing 25
Cited 2

Optimizing for reduced code space using genetic algorithms

Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Meta optimization: improving compiler heuristics with machine learning

PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
ADAPT: Automated De-Coupled Adaptive Program Transformation

ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Fast searches for effective optimization phase sequences

Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Predicting Unroll Factors Using Supervised Classification

Proceedings of the international symposium on Code generation and optimization
ACME: adaptive compilation made efficient

LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Probabilistic source-level optimisation of embedded programs

LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Using Machine Learning to Focus Iterative Optimization

Proceedings of the International Symposium on Code Generation and Optimization
Fast and Effective Orchestration of Compiler Optimizations for Automatic Performance Tuning

Proceedings of the International Symposium on Code Generation and Optimization
Fast, automatic, procedure-level performance tuning

Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Google news personalization: scalable online collaborative filtering

Proceedings of the 16th international conference on World Wide Web
Power provisioning for a warehouse-sized computer

Proceedings of the 34th annual international symposium on Computer architecture
MapReduce: simplified data processing on large clusters

OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Rapidly Selecting Good Compiler Optimizations using Performance Counters

Proceedings of the International Symposium on Code Generation and Optimization
Automating the construction of compiler heuristics using machine learning

Automating the construction of compiler heuristics using machine learning
Evaluating MapReduce for Multi-core and Multiprocessor Systems

HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
The PARSEC benchmark suite: characterization and architectural implications

Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Collective Optimization

HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Scaling Genetic Algorithms Using MapReduce

ISDA '09 Proceedings of the 2009 Ninth International Conference on Intelligent Systems Design and Applications
MiDataSets: creating the conditions for a more realistic evaluation of Iterative optimization

HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Evaluating iterative optimization across 1000 datasets

PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Using Resampling Techniques to Compute Confidence Intervals for the Harmonic Mean of Rate-Based Performance Metrics

IEEE Computer Architecture Letters
On the impact of data input sets on statistical compiler tuning

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
A practical method for quickly evaluating program optimizations

HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers

Revisiting memory management on virtualized environments

ACM Transactions on Architecture and Code Optimization (TACO)
JIT technology with C/C++: Feedback-directed dynamic recompilation for statically compiled languages

ACM Transactions on Architecture and Code Optimization (TACO)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Iterative optimization is a simple but powerful approach that searches for the best possible combination of compiler optimizations for a given workload. However, each program, if not each data set, potentially favors a different combination. As a result, iterative optimization is plagued by several practical issues that prevent it from being widely used in practice: a large number of runs are required for finding the best combination; the process can be data set dependent; and the exploration process incurs significant overhead that needs to be compensated for by performance benefits.Therefore, while iterative optimization has been shown to have significant performance potential, it is seldomly used in production compilers. In this paper, we propose Iterative Optimization for the Data Center (IODC): we show that servers and data centers offer a context in which all of the above hurdles can be overcome. The basic idea is to spawn different combinations across workers and recollect performance statistics at the master, which then evolves to the optimum combination of compiler optimizations. IODC carefully manages costs and benefits, and is transparent to the end user. We evaluate IODC using both MapReduce and throughput compute-intensive server applications. In order to reflect the large number of users interacting with the system, we gather a very large collection of data sets (at least 1000 and up to several million unique data sets per program), for a total storage of 10.7TB, and 568 days of CPU time. We report an average performance improvement of 1.48×, and up to 2.08×, for the MapReduce applications, and 1.14×, and up to 1.39×, for the throughput compute-intensive server applications.