Inline function expansion replaces a function call with the function body. With automatic inline function expansion, programmers can build programs from many small functions to manage complexity and then rely on the compiler to eliminate most of the function calls. Inline expansion therefore serves as a tool for satisfying two conflicting goals: minimizing the complexity of program development and minimizing the function call overhead of program execution. A simple inline expansion procedure is presented which uses profile information to address three critical issues: code expansion, stack expansion, and unavailable function bodies. Experiments show that a large percentage of function calls/returns (about 59%) can be eliminated with a modest code expansion cost (about 17%) for twelve UNIX programs.
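As a rough illustration of what replacing a call with the function body means, the following C sketch shows a small helper called in a loop next to a hand-expanded version of the same loop. The names `clamp`, `sum_clamped_call`, and `sum_clamped_inlined` are hypothetical and chosen only for this example; the sketch does not reproduce the paper's profile-guided procedure, it only shows the transformation applied at one call site.

```c
#include <stddef.h>

/* Small helper of the kind the abstract has in mind (hypothetical name). */
static int clamp(int x, int lo, int hi)
{
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/* Original form: one call (and one return) per array element. */
long sum_clamped_call(const int *a, size_t n, int lo, int hi)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += clamp(a[i], lo, hi);
    return sum;
}

/* Hand-expanded form: the body of clamp() replaces the call site,
   with the formal parameter x turned into a local temporary. */
long sum_clamped_inlined(const int *a, size_t n, int lo, int hi)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        int x = a[i];
        if (x < lo) x = lo;
        else if (x > hi) x = hi;
        sum += x;
    }
    return sum;
}
```

Both functions compute the same sum; the second removes the per-element call and return at the cost of duplicating the helper's body at the call site, which is the code expansion trade-off the abstract refers to. An optimizing compiler may perform this expansion on its own, so the hand-written version is for illustration only.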