Program optimization for instruction caches
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Determining average program execution times and their variance
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Register allocation across procedure and module boundaries
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Predicting program behavior using real or estimated profiles
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Register allocation via hierarchical graph coloring
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Using profile information to assist classic code optimizations
Software—Practice & Experience
Profile-guided automatic inline expansion for C programs
Software—Practice & Experience
Predicting conditional branch directions from previous runs of a program
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Using lifetime predictors to improve memory allocation performance
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Effective partial redundancy elimination
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Analysis of computational systems: Discrete Markov analysis of computer programs
ACM '65 Proceedings of the 1965 20th national conference
Static branch frequency and program profile analysis
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Using branch handling hardware to support profile-driven optimization
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Obtaining sequential efficiency for concurrent object-oriented languages
POPL '95 Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Accurate static branch prediction by value range propagation
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Corpus-based static branch prediction
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Abstract interpretation and low-level code optimization
PEPM '95 Proceedings of the 1995 ACM SIGPLAN symposium on Partial evaluation and semantics-based program manipulation
The predictability of branches in libraries
Proceedings of the 28th annual international symposium on Microarchitecture
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Evidence-based static branch prediction using machine learning
ACM Transactions on Programming Languages and Systems (TOPLAS)
Analysis and evaluation of address arithmetic capabilities in custom DSP architectures
DAC '97 Proceedings of the 34th annual Design Automation Conference
Edge profiling versus path profiling: the showdown
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Segregating heap objects by reference behavior and lifetime
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Task response time optimization using cost-based operation motion
CODES '00 Proceedings of the eighth international workshop on Hardware/software codesign
Static performance prediction of data-dependent programs
Proceedings of the 2nd international workshop on Software and performance
A framework for performance-based program partitioning
Progress in computer research
Effective sign extension elimination
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
A framework for performance-based program partitioning
Progress in computer research
Online feedback-directed optimization of Java
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Compiler support for speculative multithreading architecture with probabilistic points-to analysis
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
The Accuracy of Initial Prediction in Two-Phase Dynamic Binary Translators
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Interprocedural Probabilistic Pointer Analysis
IEEE Transactions on Parallel and Distributed Systems
Partial redundancy elimination for access expressions by speculative code motion
Software—Practice & Experience
Application of redundant computation in software performance analysis
Proceedings of the 5th international workshop on Software and performance
Lightweight reference affinity analysis
Proceedings of the 19th annual international conference on Supercomputing
Effective sign extension elimination for java
ACM Transactions on Programming Languages and Systems (TOPLAS)
In search of near-optimal optimization phase orderings
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
A compiler cost model for speculative parallelization
ACM Transactions on Architecture and Code Optimization (TACO)
Copy coalescing by graph recoloring
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
A Comparative Study of Industrial Static Analysis Tools
Electronic Notes in Theoretical Computer Science (ENTCS)
Practical exhaustive optimization phase order exploration and evaluation
ACM Transactions on Architecture and Code Optimization (TACO)
MEMOCODE'09 Proceedings of the 7th IEEE/ACM international conference on Formal Methods and Models for Codesign
Probabilistic points-to analysis
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
A systematic approach to probabilistic pointer analysis
APLAS'07 Proceedings of the 5th Asian conference on Programming languages and systems
Frequency estimation of virtual call targets for object-oriented programs
Proceedings of the 25th European conference on Object-oriented programming
Branch penalty reduction on IBM cell SPUs via software branch hinting
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Preference-Guided register assignment
CC'10/ETAPS'10 Proceedings of the 19th joint European conference on Theory and Practice of Software, international conference on Compiler Construction
Hi-index | 0.00 |
Determining the relative execution frequency of program regions is essential for many important optimization techniques, including register allocation, function inlining, and instruction scheduling. Estimates derived from profiling with sample inputs are generally regarded as the most accurate source of this information; static (compile-time) estimates are considered to be distinctly inferior. If static estimates were shown to be competitive, however, their convenience would outweigh minor gains from profiling, and they would provide a sound basis for optimization when profiling is impossible.We use quantitative metrics to compare estimates from static analysis to those derived from profiles. For C programs, simple techniques for predicting branches and loop counts suffice to estimate intraprocedural frequency patterns with high accuracy. To determine inter-procedural estimates successfully, we combine function-level information with a Markov model of control flow over the call graph to produce arc and basic block frequency estimates for the entire program.For a suite of 14 programs, including the C programs from the SPEC92 benchmark suite, we demonstrate that static estimates are competitive with those derived from profiles. Using simple heuristics, we can determine the most frequently executed blocks in each function with 81% accuracy. With the Markov model, we identify 80% of the frequently called functions. Combining the two techniques, we identify 76% of the most frequently executed call sites.