Register allocation via graph coloring
Register allocation via graph coloring
The Journal of Supercomputing - Special issue on instruction-level parallelism
Advanced compiler design and implementation
Advanced compiler design and implementation
A bandwidth-efficient architecture for media processing
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Modulo scheduling with integrated register spilling for clustered VLIW architectures
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
A Method for Register Allocation to Loops in Multiple Register File Architectures
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
A Unified Modulo Scheduling and Register Allocation Technique for Clustered Processors
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Efficient Scheduling of DSP Code on Processors with Distributed Register Files
Proceedings of the 12th international symposium on System synthesis
A programming system for the imagine media processor
A programming system for the imagine media processor
The vlsi implementation and evaluation of area- and energy-efficient streaming media processors
The vlsi implementation and evaluation of area- and energy-efficient streaming media processors
Merrimac: Supercomputing with Streams
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Prematerialization: reducing register pressure for free
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
A 64-bit stream processor architecture for scientific applications
Proceedings of the 34th annual international symposium on Computer architecture
Register allocation on stream processor with local register file
ACSAC'06 Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture
Hi-index | 0.00 |
In this paper we describe load scheduling, a novel method that balances load among register files by residual resources. Load scheduling can reduce register pressure for clustered VLIW processors with distributed register files while not increasing VLIW scheduling length. We have implemented load scheduling in compiler for Imagine and FT64 stream processors. The result shows that the proposed technique effectively reduces the number of variables spilled to memory, and can even eliminate it. The algorithm presented in this paper is extremely efficient in embedded processor with limited register resource because it can improve registers utilization instead of increasing the requirement for the number of registers.