Bulldog: a compiler for VLSI architectures
Compilers: principles, techniques, and tools
A VLIW architecture for a trace scheduling compiler
ASPLOS II Proceedings of the second international conference on Architectural support for programming languages and operating systems
Software pipelining: an effective scheduling technique for VLIW machines
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
IMPACT: an architectural framework for multiple-instruction-issue processors
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
MIPS RISC architectures
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Very Long Instruction Word architectures and the ELI-512
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Percolation Scheduling: A Parallel Compilation Technique
Trailblazing: A Hierarchical Approach to Percolation Scheduling
ICPP '93 Proceedings of the 1993 International Conference on Parallel Processing - Volume 02
Trace Scheduling: A Technique for Global Microcode Compaction
IEEE Transactions on Computers
Superscalar and Very Long Instruction Word (VLIW) architectures exploit fine-grain parallelism to achieve better performance. Static scheduling techniques, such as trace scheduling [1] and superblock scheduling [2], can effectively produce compact code for these architectures. In this paper, we present an analytical approach to bookkeeping in code scheduling that alleviates the coding complexity and instruction duplication of previous approaches. We describe techniques that use global information to move instructions around loop and if-then-else constructs. We also show that, depending on the classification of the register sets, certain instructions can be moved around subroutine calls, because their register live ranges can be predetermined across procedure boundaries at compile time. We compare performance in terms of speed-up, code size, and scheduling time. Experimental results indicate that both code growth and speed-up improve, at the cost of a small increase in scheduling time.
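
The register-classification idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the classification it uses, so the sketch assumes a MIPS-style calling convention in which `$s0`-`$s7` are callee-saved and therefore hold the same values before and after a call. The `Instr` type and the `can_move_across_call` helper are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass

# Assumption: MIPS-style convention where $s0-$s7 are callee-saved.
# The classification actually used in the paper is not stated in the abstract.
CALLEE_SAVED = frozenset(f"$s{i}" for i in range(8))


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: opcode plus registers written and read."""
    op: str
    defs: frozenset = frozenset()
    uses: frozenset = frozenset()
    touches_memory: bool = False


def can_move_across_call(instr: Instr) -> bool:
    """Conservative check for moving an instruction past a subroutine call.

    Memory operations are rejected because the callee may read or write memory.
    Otherwise the move is allowed only if every register the instruction
    reads or writes is callee-saved, so the call cannot change its operands
    and does not depend on its result.
    """
    if instr.touches_memory:
        return False
    return (instr.defs | instr.uses) <= CALLEE_SAVED


# Reads and writes only callee-saved registers.
add_saved = Instr("addu", defs=frozenset({"$s0"}), uses=frozenset({"$s1", "$s2"}))
# Writes a caller-saved temporary that the callee may clobber.
add_temp = Instr("addu", defs=frozenset({"$t0"}), uses=frozenset({"$s1"}))
# Load from memory: rejected regardless of the registers involved.
load = Instr("lw", defs=frozenset({"$s3"}), uses=frozenset({"$s4"}), touches_memory=True)

for instr in (add_saved, add_temp, load):
    print(instr.op, sorted(instr.defs), can_move_across_call(instr))
```

The check is deliberately conservative: a real scheduler would also need dependence analysis against the surrounding instructions, which this sketch omits.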