Processors such as StrongARM and memory such as SDRAM enable efficient execution of multiple loads and stores in a single instruction. This is particularly useful in connection with register allocation, where spill code may need to save and restore multiple registers. Until now, no effective strategy has exploited this capability to its full potential. In this paper we investigate the use of SDRAM for optimization of spill code. The core of the problem is to arrange the variables in the spill area so that loading from and storing to the SDRAM is as efficient as possible. We show that the problem is NP-complete and present a method based on integer linear programming (ILP) to solve it. We have implemented our approach as an additional phase in a gcc-based compiler for the StrongARM core of Intel's IXP-1200 network processor. Our optimizer, SLA (stack location allocator), rearranges the scalar variables so that memory accesses become cheaper. Our experimental results show that our ILP-based method is efficient and that the code generated for our benchmarks runs 0.8% to 15.1% faster than the code produced by the original compiler with -O2 optimization. Our SLA phase is guaranteed not to degrade execution-time performance and can be configured so as not to increase the code size.
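The layout problem described above can be illustrated with a toy sketch. It is not the paper's ILP formulation: it is a brute-force search over slot orders, workable only for a handful of variables, and the cost model is an assumption made for illustration. Two back-to-back loads count as mergeable into one load-multiple when the variable destined for the lower-numbered register sits in the stack slot immediately below the other variable, which mirrors the ARM rule that a load-multiple pairs ascending register numbers with ascending addresses. The function names, the `load_pairs` encoding, and the sample data are all hypothetical.

```python
from itertools import permutations


def merged_pairs(layout, load_pairs):
    """Count load pairs that one load-multiple could cover under this layout.

    layout maps a variable name to its slot index (one word per slot).
    Each pair is ((var_a, reg_a), (var_b, reg_b)): two loads issued back to back.
    Assumed toy rule: the variable headed for the lower-numbered register must
    occupy the slot immediately below the other variable's slot.
    """
    count = 0
    for (va, ra), (vb, rb) in load_pairs:
        lo, hi = (va, vb) if ra < rb else (vb, va)
        if layout[hi] - layout[lo] == 1:
            count += 1
    return count


def best_layout(variables, load_pairs):
    """Try every slot order and keep the one with the most mergeable pairs."""
    best, best_score = None, -1
    for order in permutations(variables):
        layout = {v: i for i, v in enumerate(order)}
        score = merged_pairs(layout, load_pairs)
        if score > best_score:
            best, best_score = layout, score
    return best, best_score


# Hypothetical spill-load pairs: (variable, destination register number).
pairs = [
    (("a", 1), ("c", 2)),
    (("c", 3), ("b", 4)),
    (("d", 5), ("a", 6)),
]
naive = {"a": 0, "b": 1, "c": 2, "d": 3}  # declaration order
layout, score = best_layout(["a", "b", "c", "d"], pairs)
print(merged_pairs(naive, pairs), score, layout)
```

In this sample, declaration order allows no pair to merge, while the order d, a, c, b allows all three. The exhaustive search grows factorially with the number of variables, which is in keeping with the NP-completeness result and with the choice of an ILP solver for realistic inputs.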