A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Access normalization: loop restructuring for NUMA computers
ACM Transactions on Computer Systems (TOCS)
Array SSA form and its use in parallelization
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Fusion-based register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Data Relation Vectors: A New Abstraction for Data Optimizations
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Efficient global register allocation for minimizing energy consumption
ACM SIGPLAN Notices
Memory Issues in Embedded Systems-on-Chip: Optimizations and Exploration
Memory Issues in Embedded Systems-on-Chip: Optimizations and Exploration
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Transformations for Improving Data Access Locality in Non-Perfectly Nested Loops
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Register allocation and spilling via graph coloring
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Coloring heuristics for register allocation
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Code Generation in the Polyhedral Model Is Easier Than You Think
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Cache aware optimization of stream programs
LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Stream Register Files with Indexed Access
HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
Embedded System Design
VICTORIA: VMX indirect compute technology oriented towards in-line acceleration
Proceedings of the 3rd conference on Computing frontiers
Compiling for stream processing
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Semi-automatic composition of loop transformations for deep parallelism and memory hierarchies
International Journal of Parallel Programming
Very wide register: an asymmetric register file organization for low power embedded processors
Proceedings of the conference on Design, automation and test in Europe
Compiling for an indirect vector register architecture
Proceedings of the 5th conference on Computing frontiers
Register allocation by puzzle solving
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Optimizing scientific application loops on stream processors
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
Hi-index | 0.00 |
Low power design criteria for embedded systems have lead to many innovative architectures. One of the core architectural changes that have come in the recent past are streaming registers. These architectures have been shown to be both power efficient and performance efficient. However code has to be efficiently mapped on them to make maximal use of their potential. This paper introduces a novel technique for compiling C code on streaming registers. The proposed technique not only uses the temporal locality in arrays but also spatial locality to map code on streaming registers. The proposed Stream Register Allocation (SARA) technique is also shown to provide good mapping efficiency as well as it is shown to be scalable on realistic applications.