Global value numbers and redundant computations
POPL '88 Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
The value flow graph: a program representation for optimal program transformations
Proceedings of the third European symposium on programming on ESOP '90
Improving register allocation for subscripted variables
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
IMPACT: an architectural framework for multiple-instruction-issue processors
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
A practical data flow framework for array reference analysis and its use in optimizations
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Effective partial redundancy elimination
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Scalar replacement in the presence of conditional control flow
Software—Practice & Experience
Optimal code motion: theory and practice
ACM Transactions on Programming Languages and Systems (TOPLAS)
The undecidability of aliasing
ACM Transactions on Programming Languages and Systems (TOPLAS)
Improving the accuracy of static branch prediction using branch correlation
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Compiler optimizations for improving data locality
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Improving the ratio of memory operations to floating-point operations in loops
ACM Transactions on Programming Languages and Systems (TOPLAS)
Avoiding conditional branches by code replication
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Proceedings of the tenth annual conference on Object-oriented programming systems, languages, and applications
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Array data flow analysis for load-store optimizations in fine-grain architectures
International Journal of Parallel Programming - Special issue: selected papers from the eighth international workshop on languages and compilers for parallel computing
Interprocedural conditional branch elimination
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Partial dead code elimination using slicing transformations
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Register promotion in C programs
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Resource-sensitive profile-directed data flow analysis for code optimization
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Refining data flow information using infeasible paths
ESEC '97/FSE-5 Proceedings of the 6th European SOFTWARE ENGINEERING conference held jointly with the 5th ACM SIGSOFT international symposium on Foundations of software engineering
Edge profiling versus path profiling: the showdown
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Path-sensitive value-flow analysis
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Complete removal of redundant expressions
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
A new algorithm for scalar register promotion based on SSA form
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Register promotion by sparse partial redundancy elimination of loads and stores
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Global optimization by suppression of partial redundancies
Communications of the ACM
Type Directed Cloning for Object-Oriented Programs
LCPC '95 Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing
SAS '96 Proceedings of the Third International Symposium on Static Analysis
Path-sensitive, value-flow optimizations of programs (program analysis)
Path-sensitive, value-flow optimizations of programs (program analysis)
Modular interprocedural pointer analysis using access paths: design, implementation, and evaluation
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
ABCD: eliminating array bounds checks on demand
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Load and store reuse using register file contents
ICS '01 Proceedings of the 15th international conference on Supercomputing
Timestamped whole program path representation and its applications
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
An efficient static analysis algorithm to detect redundant memory operations
Proceedings of the 2002 workshop on Memory system performance
A compiler framework for speculative analysis and optimizations
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Load Redundancy Removal through Instruction Reuse
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Reducing data cache energy consumption via cached load/store queue
Proceedings of the 2003 international symposium on Low power electronics and design
An experimental evaluation of scalar replacement on scientific benchmarks
Software—Practice & Experience
Extending Path Profiling across Loop Backedges and Procedure Boundaries
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Complete removal of redundant expressions
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Applications of storage mapping optimization to register promotion
Proceedings of the 18th annual international conference on Supercomputing
A compiler framework for speculative optimizations
ACM Transactions on Architecture and Code Optimization (TACO)
Partial redundancy elimination for access expressions by speculative code motion
Software—Practice & Experience
Interprocedural Speculative Optimization of Memory Accesses to Global Variables
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
Redundancy elimination revisited
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Limits for a feasible speculative trace reuse implementation
International Journal of High Performance Systems Architecture
Local redundant polymorphism query elimination
Proceedings of the 8th International Conference on the Principles and Practice of Programming in Java
Dynamic method to evaluate code optimization effectiveness
Proceedings of the 15th International Workshop on Software and Compilers for Embedded Systems
Ball-Larus path profiling across multiple loop iterations
Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications
Interprocedural strength reduction of critical sections in explicitly-parallel programs
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Hi-index | 0.00 |
Load-reuse analysis finds instructions that repeatedly access the same memory location. This location can be promoted to a register, eliminating redundant loads by reusing the results of prior memory accesses. This paper develops a load-reuse analysis and designs a method for evaluating its precision.In designing the analysis, we aspire for completeness---the goal of exposing all reuse that can be harvested by a subsequent program transformation. For register promotion, a suitable transformation is partial redundancy elimination (PRE). To approach the ideal goal of PRE-completeness, the load-reuse analysis is phrased as a data-flow problem on a program representation that is path-sensitive, as it detects reuse even when it originates in a different instruction along each control flow path. Furthermore, the analysis is comprehensive, as it treats scalar, array and pointer-based loads uniformly.In evaluating the analysis, we compare it with an ideal analysis. By observing the run-time stream of memory references, we collect all PRE-exploitable reuse and treat it as the ideal analysis performance. To compare the (static) load-reuse analysis with the (dynamic) ideal reuse, we use an estimator algorithm that computes, given a data-flow solution and a program profile, the dynamic amount of reuse detected by the analysis. We developed a family of estimators that differ in how well they bound the profiling error inherent in the edge profile. By bounding the error, the estimators offer a precise and practical method for determining the run-time optimization benefit.Our experiments show that about 55% of loads executed in Spec95 exhibit reuse. Of those, our analysis exposes about 80%.