Enhanced bitwidth-aware register allocation

Authors:
Rajkishore Barik; Vivek Sarkar
Affiliations:
IBM T.J. Watson Research Center; IBM T.J. Watson Research Center
Venue:
CC'06 Proceedings of the 15th international conference on Compiler Construction
Year:
2006

References (16):

Improvements to graph coloring register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)

Memory access coalescing: a technique for eliminating redundant memory accesses
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation

The filter cache: an energy efficient memory structure
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture

Array SSA form and its use in parallelization
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages

Quality and speed in linear-scan register allocation
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation

Linear scan register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)

Bitwidth analysis with application to silicon compilation
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation

Exploiting superword level parallelism with multimedia instruction sets
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation

Bit section instruction set extension of ARM for embedded applications
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems

Bitwidth aware global register allocation
POPL '03 Proceedings of the 30th ACM SIGPLAN-SIGACT symposium on Principles of programming languages

BitValue Inference: Detecting and Exploiting Narrow Bitwidth Computations
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing

A Representation for Bit Section Based Analysis and Optimization
CC '02 Proceedings of the 11th International Conference on Compiler Construction

Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction

Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture

Compiler Analysis of the Value Ranges for Variables
IEEE Transactions on Software Engineering

Speculative subword register allocation in embedded processors
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing

Cited by (2):

Subregion analysis and bounds check elimination for high level arrays
CC'11/ETAPS'11 Proceedings of the 20th international conference on Compiler construction: part of the joint European conferences on theory and practice of software

Improved bitwidth-aware variable packing
ACM Transactions on Architecture and Code Optimization (TACO)

Abstract

Like general-purpose processors in desktop and server systems, embedded processors depend on register files for performance. However, unlike general-purpose processors, the power consumption of register files poses a significant challenge for embedded processors, making it desirable for embedded processors to use as few registers as possible. Past research has indicated the potential for leveraging bitwidth analysis and bitwidth-aware register allocation to reduce register usage in embedded applications. This paper makes three contributions toward evaluating and enhancing bitwidth-aware register allocation for embedded applications. First, we compare the Tallam-Gupta bitwidth analysis with an idealized limit study, and show that significant opportunities for enhancement remain. Second, we show how bitwidth-aware register allocation can be improved through enhanced bitwidth analysis for scalar and array variables, and through enhanced coalescing of variables. Third, we use our prototype implementation of bitwidth-aware register allocation in gcc to compare the number of dynamic spill load/store instructions resulting from (a) bitwidth-unaware allocation, (b) bitwidth-aware allocation, (c) enhanced bitwidth-aware allocation, and (d) ideal profile-driven bitwidth-aware allocation. Our results show that our enhancements can reduce the number of dynamic spill load/store instructions to between 3% and 27% of the number obtained from the Tallam-Gupta algorithm.
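
The general idea behind bitwidth-aware allocation can be illustrated with a small sketch. It is not the Tallam-Gupta algorithm or the enhanced allocator evaluated in the paper. It assumes a 32-bit register, treats all variables as live at the same time, and ignores interference graphs, coalescing, and the cost of the extra shift and mask instructions. The function names (`pack_first_fit`, `read_field`, `write_field`) and the example bitwidths are hypothetical.

```python
REGISTER_BITS = 32  # assumed register width, not taken from the paper


def pack_first_fit(bitwidths, register_bits=REGISTER_BITS):
    """Pack simultaneously live variables into registers, widest first.

    bitwidths maps a variable name to the number of bits it needs.
    Returns a list of registers, each a list of (name, bit_offset, width).
    """
    registers = []  # one list of (name, offset, width) per register
    used = []       # bits already occupied in each register
    # Widest variables first; ties broken by name for a stable result.
    for name, width in sorted(bitwidths.items(), key=lambda kv: (-kv[1], kv[0])):
        if not 0 < width <= register_bits:
            raise ValueError(f"{name}: width {width} does not fit a register")
        for i in range(len(registers)):
            if used[i] + width <= register_bits:
                registers[i].append((name, used[i], width))
                used[i] += width
                break
        else:
            # No existing register has room, so open a new one.
            registers.append([(name, 0, width)])
            used.append(width)
    return registers


def read_field(register_value, offset, width):
    """Extract the width-bit field that starts at bit offset."""
    return (register_value >> offset) & ((1 << width) - 1)


def write_field(register_value, offset, width, value):
    """Return register_value with the field at offset replaced by value."""
    mask = ((1 << width) - 1) << offset
    return (register_value & ~mask) | ((value << offset) & mask)


# Hypothetical bitwidths, as a bitwidth analysis might report them.
widths = {"flag": 1, "count": 12, "index": 10, "sum": 32, "ch": 8}
layout = pack_first_fit(widths)
print(len(widths), "registers without packing,", len(layout), "with packing")
```

In this example the four narrow variables share one register while `sum` keeps a full register to itself. A tighter bitwidth analysis reports smaller widths, so more variables fit in each register and fewer values need to be spilled.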