Inter-procedural stacked register allocation for Itanium® like architecture

  • Authors:
  • Liu Yang; Sun Chan; G. R. Gao; Roy Ju; Guei-Yuan Lueh; Zhaoqing Zhang

  • Affiliations:
  • Institute of Computing Technology, CAS, Beijing, P.R. China; Microprocessor Research Labs, Intel Labs, Santa Clara, CA; University of Delaware; Microprocessor Research Labs, Intel Labs, Santa Clara, CA; Microprocessor Research Labs, Intel Labs, Santa Clara, CA; Institute of Computing Technology, CAS, Beijing, P.R. China

  • Venue:
  • ICS '03 Proceedings of the 17th annual international conference on Supercomputing
  • Year:
  • 2003

Abstract

The Itanium® architecture implements a hardware-managed register stack, the Register Stack Engine (RSE), to provide a unified and flexible register structure to software. The compiler allocates each procedure a register stack frame whose size is explicitly specified by an alloc instruction. When the total number of registers used by the procedures on the call stack exceeds the number of physical registers, the RSE automatically performs register overflows and fills to ensure that the current procedure has its requested registers available. The virtual register stack frames and the RSE alleviate the need for explicit spills by the compiler, but our experimental results indicate that, under high register pressure, a trade-off exists between using stacked registers and explicit spills because their costs are uneven. In this work, we introduce the stacked register quota assignment problem, based on the observation that reducing stacked register usage in some procedures can reduce the total memory access time of spilling registers, which includes the time spent on loads/stores due to explicit register spills and on RSE overflows/fills. We propose a new inter-procedural algorithm that solves the problem by allocating stacked registers across procedures based on a quantitative cost model. The results show that our approach can improve performance significantly for programs with high RSE overflow cost, e.g., perlbmk and crafty, which improve by 14% and 3.7%, respectively.
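
To illustrate the overflow/fill behaviour the abstract describes, the following is a minimal Python sketch of a simplified register stack model. It is not the authors' algorithm or cost model, and it does not reproduce the real RSE, which is more elaborate. The function name `simulate_rse`, the event format, and the frame sizes are all hypothetical; the default of 96 physical stacked registers corresponds to r32 through r127 on Itanium. The model assumes registers overflow from the oldest frame first and that a caller's missing registers are filled at return.

```python
def simulate_rse(trace, frame_size, physical=96):
    """Count registers overflowed and filled for a call/return trace.

    trace      -- list of ("call", name) or ("ret",) events (hypothetical format)
    frame_size -- dict mapping procedure name to its stacked-register frame size
    physical   -- number of physical stacked registers (assumed 96)
    """
    stack = []            # one [size, resident] pair per live frame, oldest first
    resident_total = 0    # registers currently held in the physical register file
    overflows = fills = 0

    for event in trace:
        if event[0] == "call":
            size = frame_size[event[1]]
            assert size <= physical, "a single frame must fit in the register file"
            stack.append([size, size])
            resident_total += size
            # Overflow registers from the oldest frames until everything fits.
            i = 0
            while resident_total > physical:
                frame = stack[i]
                spill = min(frame[1], resident_total - physical)
                frame[1] -= spill
                resident_total -= spill
                overflows += spill
                i += 1
        else:
            # Return: drop the callee frame, then restore the caller's frame.
            _, resident = stack.pop()
            resident_total -= resident
            if stack:
                caller = stack[-1]
                missing = caller[0] - caller[1]
                caller[1] = caller[0]
                resident_total += missing
                fills += missing

    return overflows, fills
```

Under these assumptions, a call chain A, B, C with frames of 50, 60, and 40 registers followed by three returns overflows 54 registers and fills 54. Lowering B's quota to 46 reduces both counts to 40, at the price of whatever explicit spills B then needs. That is the kind of trade-off a quota assignment has to weigh.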