Improved bitwidth-aware variable packing

Authors:
V. Krishna Nandivada;Rajkishore Barik
Affiliations:
IIT Madras, India;Intel Labs, Santa Clara, CA
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2008

Citing 21
Cited 0

Bin packing with divisible item sizes
Journal of Complexity

Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)

Improvements to graph coloring register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)

Iterated register coalescing
ACM Transactions on Programming Languages and Systems (TOPLAS)

Array SSA form and its use in parallelization
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages

Advanced compiler design and implementation

Scheduling with conflicts, and applications to traffic signal control
Proceedings of the seventh annual ACM-SIAM symposium on Discrete algorithms

Bitwidth analysis with application to silicon compilation
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation

Optimal spilling for CISC machines with few registers
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation

Fast copy coalescing and live-range identification
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation

Computers and Intractability: A Guide to the Theory of NP-Completeness

Bitwidth aware global register allocation
POPL '03 Proceedings of the 30th ACM SIGPLAN-SIGACT symposium on Principles of programming languages

Concurrent Static Single Assignment Form and Constant Propagation for Explicitly Parallel Programs
LCPC '97 Proceedings of the 10th International Workshop on Languages and Compilers for Parallel Computing

Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction

Windows scheduling as a restricted version of Bin Packing
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms

Physical Register Inlining
Proceedings of the 31st annual international symposium on Computer architecture

Optimistic register coalescing
ACM Transactions on Programming Languages and Systems (TOPLAS)

Register Packing: Exploiting Narrow-Width Operands for Reducing Register File Pressure
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture

A Small, Fast and Low-Power Register File by Bit-Partitioning
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture

On the Complexity of Register Coalescing
Proceedings of the International Symposium on Code Generation and Optimization

Enhanced bitwidth-aware register allocation
CC'06 Proceedings of the 15th international conference on Compiler Construction

Abstract

Bitwidth-aware register allocation has attracted the attention of researchers aiming to reduce the number of variables spilled to memory. For general-purpose processors, this improves execution time and reduces runtime memory requirements (which in turn helps when compiling programs for memory-constrained systems). Bitwidth-aware register allocation has also been effective in reducing power consumption in embedded processors. A key component of bitwidth-aware register allocation is the variable packing algorithm, which packs multiple narrow-width variables into one physical register. Tallam and Gupta [2003] proved that optimal variable packing is NP-complete for arbitrary-width variables and proposed an approximate solution. In this article, we analyze the complexity of the variable packing problem and present three enhancements that improve the overall packing of variables: (a) Width Static Single Assignment (W-SSA) form, a representation that splits the live range of a variable into several fixed-width live ranges (W-SSA variables); (b) PoTR representation, the use of a powers-of-two representation for the bitwidth information of W-SSA variables. Our empirical results show that the bit wastage resulting from overapproximating the width of each variable to the next power of two is a small fraction of the total number of bits in use (≈13%). The main advantage of this representation is that it leads to optimal variable packing in polynomial time; (c) combined packing and coalescing: we discuss the importance of coalescing (combining variables whose live ranges do not interfere) in the context of variable packing and present an iterative algorithm that performs coalescing and packing of W-SSA variables represented in PoTR.
Our experimental results show up to a 76.00% decrease in the number of variables compared to the input program in Static Single Assignment (SSA) form. This decrease led to a significant reduction in dynamic spilling, packing, and unpacking instructions.
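
To make the PoTR idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a heavily simplified setting in which all variables are live at the same program point, so that packing reduces to classical bin packing; live ranges, interference, W-SSA construction, and coalescing are all ignored. The function names, the 32-bit register size, and the sample widths are hypothetical. The sketch rounds each width up to the next power of two and then applies first-fit decreasing, a heuristic known to be optimal for bin packing when item sizes form a divisible sequence, as powers of two do.

```python
def next_power_of_two(width: int) -> int:
    """Round a positive bitwidth up to the next power of two (PoTR-style)."""
    if width < 1:
        raise ValueError("bitwidth must be positive")
    return 1 << (width - 1).bit_length()


def pack_widths(widths, register_bits=32):
    """Pack rounded widths into registers using first-fit decreasing.

    Simplifying assumption: every variable is live at the same time, so
    two variables may share a register only if their widths fit together.
    Returns a list of registers, each a list of the rounded widths in it.
    """
    rounded = sorted((next_power_of_two(w) for w in widths), reverse=True)
    registers = []  # widths placed in each register
    free = []       # free bits left in each register, parallel to `registers`
    for w in rounded:
        if w > register_bits:
            raise ValueError("variable wider than a register")
        for i, room in enumerate(free):
            if room >= w:  # first register with enough room
                registers[i].append(w)
                free[i] -= w
                break
        else:  # no existing register fits: open a new one
            registers.append([w])
            free.append(register_bits - w)
    return registers


if __name__ == "__main__":
    # Hypothetical bitwidths of eight simultaneously live variables.
    sample = [3, 5, 9, 1, 12, 7, 2, 16]
    for index, contents in enumerate(pack_widths(sample)):
        print(f"register {index}: {contents}")
```

In this toy input the eight variables fit into three 32-bit registers instead of eight. The real problem is harder because variables with non-overlapping live ranges can also share the same bits, which is where the coalescing step described in the abstract comes in.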