Optimization for the Intel® Itanium® architecture register stack

  • Authors:
  • Alex Settle;Daniel A. Connors;Gerolf Hoflehner;Dan Lavery

  • Affiliations:
  • University of Colorado at Boulder;University of Colorado at Boulder;Intel Corporation, Santa Clara, CA;Intel Corporation, Santa Clara, CA

  • Venue:
  • Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
  • Year:
  • 2003

Abstract

The Intel® Itanium® architecture contains a number of innovative compiler-controllable features designed to exploit instruction-level parallelism. New code generation and optimization techniques are critical to applying these features to improve processor performance. For instance, the Itanium® architecture provides a compiler-controllable virtual register stack to reduce the penalty of memory accesses associated with procedure calls. The Itanium® Register Stack Engine (RSE) transparently manages the register stack, saving and restoring physical registers to and from memory as needed. Existing code generation techniques for the register stack aggressively allocate virtual registers without regard to the register pressure on different control-flow paths. As a result, applications with large data sets may stress the RSE and cause substantial execution delays due to the high number of register saves and restores. Since the Itanium® architecture is developed around Explicitly Parallel Instruction Computing (EPIC) concepts, solutions for increasing register stack efficiency favor code generation techniques rather than hardware approaches.
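
To make the cost of aggressive register stack allocation concrete, the following is a minimal toy model, not the paper's method and not a cycle-accurate description of the RSE. It assumes 96 physical stacked registers, treats each procedure frame as a fixed number of registers, ignores the overlap between a caller's output registers and its callee's frame, and counts one save or restore per register moved. The names `RegisterStackModel` and `simulate` are hypothetical and exist only for this illustration.

```python
class RegisterStackModel:
    """Toy model of a register stack backed by memory (illustrative assumption)."""

    def __init__(self, capacity=96):
        self.capacity = capacity  # physical stacked registers (assumed 96)
        self.frames = []          # frame sizes, outermost procedure first
        self.spilled = 0          # oldest registers currently held in memory
        self.spills = 0           # total registers saved to memory
        self.fills = 0            # total registers restored from memory

    def call(self, frame_size):
        """Allocate a new frame; save the oldest registers on overflow."""
        if frame_size > self.capacity:
            raise ValueError("frame larger than the physical register stack")
        self.frames.append(frame_size)
        resident = sum(self.frames) - self.spilled
        overflow = resident - self.capacity
        if overflow > 0:
            self.spilled += overflow
            self.spills += overflow

    def ret(self):
        """Pop the current frame; restore any saved part of the caller's frame."""
        self.frames.pop()
        total = sum(self.frames)
        top = self.frames[-1] if self.frames else 0
        below = total - top               # registers owned by older frames
        missing = self.spilled - below    # part of the caller's frame in memory
        if missing > 0:
            self.spilled -= missing
            self.fills += missing


def simulate(depth, frame_size, capacity=96):
    """Run a call chain of the given depth, then unwind it completely."""
    model = RegisterStackModel(capacity)
    for _ in range(depth):
        model.call(frame_size)
    for _ in range(depth):
        model.ret()
    return model.spills, model.fills


if __name__ == "__main__":
    print(simulate(4, 32))  # large frames overflow the physical stack
    print(simulate(4, 16))  # smaller frames fit without any memory traffic
```

Under these assumptions, the same four-deep call chain triggers saves and restores only when each procedure requests the larger frame, which is the kind of pressure that allocation aware of control-flow paths would aim to reduce.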