Inline Analysis: Beyond Selection Heuristics

Authors:
Dhruva R. Chakrabarti;Shin-Ming Liu
Affiliations:
Hewlett-Packard Company;Hewlett-Packard Company
Venue:
Proceedings of the International Symposium on Code Generation and Optimization
Year:
2006

Abstract

Research on procedure inlining has mainly focused on heuristics that decide whether inlining a particular call-site maximizes application performance. However, other equally important aspects of inline analysis, such as the order in which call-sites are analyzed, the indirect effects of inlining, and the selection of the most profitable version of a procedure, warrant more attention. This paper evaluates several sequences in which call-sites are examined for inlining and shows that choosing the correct order is crucial to obtaining the best run-time performance. We then present a novel work-list-based, dynamically updated sequence that achieves the best results. When applying cross-module inline analysis to large applications with thousands of files and millions of lines of code, we separate the analysis phase from the transformation phase and let the analysis work solely on summary information in order to reduce compile time and memory consumption. A focus of this paper is to enumerate the summaries that our compiler maintains, present a technique to compute the goodness factor on which the work-list sequence is based, and describe methods to update the summaries continuously whenever a call-site is accepted for inlining. We then introduce inline specialization, a new technique that facilitates selective inlining into call chains. The power of inline specialization lies in its ability to choose the most profitable version of the called procedure without having to maintain multiple versions at any point in time. We discuss the implementation of these techniques in the HP-UX Itanium production compiler and present experimental results showing that a dynamic work-list-based analysis order, comprehensive summary updates, and inline specialization significantly improve application performance.
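
The general shape of a work-list-driven analysis order with summary updates can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not give the goodness formula or the summary contents. The `Summary` fields, the `goodness` ratio (call frequency divided by callee size), the `growth_budget` parameter, and the procedure names in the example are all assumptions made for illustration. The sketch works only on summaries, returns a plan for a later transformation phase, and refreshes priorities when an accepted inline changes a callee's size.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass
class Summary:
    """Hypothetical per-procedure summary; bodies are never inspected."""
    size: int                                  # estimated size in instructions
    entries: float                             # how often the procedure is entered
    calls: dict = field(default_factory=dict)  # callee name -> call frequency


def goodness(freq, callee_size):
    # Assumed benefit/cost ratio: hot calls to small callees rank first.
    return freq / callee_size


def plan_inlines(summaries, growth_budget):
    """Return accepted (caller, callee) pairs in the order they were chosen."""
    tie = itertools.count()  # unique tie-breaker so heap entries never compare names
    heap = []

    def push(caller, callee):
        freq = summaries[caller].calls.get(callee, 0)
        if caller != callee and freq > 0 and callee in summaries:
            g = goodness(freq, summaries[callee].size)
            heapq.heappush(heap, (-g, next(tie), caller, callee, freq))

    for name, s in summaries.items():
        for callee in s.calls:
            push(name, callee)

    plan, growth = [], 0
    while heap:
        neg_g, _, caller, callee, freq = heapq.heappop(heap)
        src, dst = summaries[callee], summaries[caller]
        if dst.calls.get(callee) != freq:
            continue                  # stale entry: the call was inlined or re-pushed
        if -neg_g != goodness(freq, src.size):
            push(caller, callee)      # callee size changed: refresh the priority
            continue
        if src.entries <= 0 or growth + src.size > growth_budget:
            continue                  # rejected: no profile data or over budget
        # Accept the call-site and update the summaries it affects.
        plan.append((caller, callee))
        growth += src.size
        dst.size += src.size
        del dst.calls[callee]
        share = min(1.0, freq / src.entries)   # fraction of callee runs absorbed
        src.entries *= 1.0 - share
        for c2, f2 in list(src.calls.items()):
            dst.calls[c2] = dst.calls.get(c2, 0) + f2 * share  # inherited call
            src.calls[c2] = f2 * (1.0 - share)                 # out-of-line copy
            push(caller, c2)
            push(callee, c2)
    return plan


# Made-up call graph for illustration only.
summaries = {
    "main": Summary(size=100, entries=1, calls={"hot": 1000, "cold": 1}),
    "hot": Summary(size=10, entries=1000, calls={"leaf": 2000, "log": 10}),
    "cold": Summary(size=50, entries=1, calls={"leaf": 1}),
    "leaf": Summary(size=5, entries=2001),
    "log": Summary(size=40, entries=10),
}
plan = plan_inlines(summaries, growth_budget=20)
print(plan)
```

In this example the call from `hot` to `leaf` is accepted first. That grows `hot`, so the queued entry for the call from `main` to `hot` is re-ranked with the new size before it is accepted. The call to `log` that `main` inherits from `hot` is then considered as a new work-list entry and rejected by the budget. A fixed top-down or bottom-up traversal would not revisit priorities in this way, which is the contrast the sketch is meant to show.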