A lifetime optimal algorithm for speculative PRE

  • Authors:
  • Jingling Xue;Qiong Cai

  • Affiliations:
  • University of New South Wales, Sydney, NSW, Australia;University of New South Wales, Sydney, NSW, Australia

  • Venue:
  • ACM Transactions on Architecture and Code Optimization (TACO)
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

A lifetime optimal algorithm, called MC-PRE, is presented for the first time that performs speculative PRE based on edge profiles. In addition to being computationally optimal in the sense that the total number of dynamic computations for an expression in the transformed code is minimized, MC-PRE is also lifetime optimal since the lifetimes of introduced temporaries are also minimized. The key in achieving lifetime optimality lies not only in finding a unique minimum cut on a transformed graph of a given CFG, but also in performing a data-flow analysis directly on the CFG to avoid making unnecessary code insertions and deletions. The lifetime optimal results are rigorously proved. We evaluate our algorithm in GCC against three previously published PRE algorithms, namely, MC-PREcopt (Qiong and Xue's computationally optimal version of MC-PRE), LCM (Knoop, Rüthing, and Steffen's lifetime optimal algorithm for performing nonspeculative classic PRE), and CMP-PRE (Bodik, Gupta, and Soffa's PRE algorithm based on code-motion preventing (CMP) regions, which is speculative but not computationally optimal). We report and analyze our experimental results, obtained from both actual program execution and instrumentation, for all 22 C, C++ and FORTRAN 77 benchmarks from SPECcpu2000 on an Itanium 2 computer system. Our results show that MC-PRE (or MC-PREcopt) is capable of eliminating more partial redundancies than both LCM and CMP-PRE (especially in functions with complex control flow), and, in addition, MC-PRE inserts temporaries with shorter lifetimes than MC-PREcopt. Each of both benefits has contributed to the performance improvements in benchmark programs at the costs of only small compile-time and code-size increases in some benchmarks.