Inline function expansion for compiling C programs

Authors:
P. P. Chang; W.-W. Hwu
Affiliations:
Coordinated Science Laboratory, University of Illinois, 1101 W. Springfield Ave., Urbana, IL (both authors)
Venue:
PLDI '89: Proceedings of the ACM SIGPLAN 1989 Conference on Programming Language Design and Implementation
Year:
1989


Abstract

Inline function expansion replaces a function call with the function body. With automatic inline function expansion, programs can be built from many small functions to manage complexity, relying on the compiler to eliminate most of the function calls. Inline expansion therefore serves as a tool for satisfying two conflicting goals: minimizing the complexity of program development and minimizing the function call overhead of program execution. A simple inline expansion procedure is presented which uses profile information to address three critical issues: code expansion, stack expansion, and unavailable function bodies. Experiments show that a large percentage of function calls/returns (about 59%) can be eliminated with a modest code expansion cost (about 17%) for twelve UNIX programs.
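
As a minimal illustration of the transformation the abstract describes, the C sketch below shows a small helper called in a loop and a hand-expanded equivalent in which the call is replaced by the function body. It is not taken from the paper: the names square, sum_squares_calls, and sum_squares_expanded are hypothetical, and the expansion is written by hand rather than produced by any particular compiler.

```c
#include <stddef.h>

/* Hypothetical small helper of the kind a programmer writes for clarity. */
static int square(int x)
{
    return x * x;
}

/* Before expansion: one call and one return per array element. */
int sum_squares_calls(const int *v, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += square(v[i]);
    return sum;
}

/* After expansion (done by hand here): the call is replaced by the body.
 * The argument is bound to a local so it is evaluated exactly once,
 * which preserves the semantics of the original call. */
int sum_squares_expanded(const int *v, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        int x = v[i];
        sum += x * x;
    }
    return sum;
}
```

Both functions return the same value for any input. The expanded version executes no calls inside the loop but duplicates the helper's body at the call site, which is the trade-off between call overhead and code expansion that the abstract quantifies.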