This paper presents a new ILP processor architecture called Compressed VLIW (CVLIW). The CVLIW processor constructs a sequence of long instructions by removing nearly all NOPs (No OPerations) and LNOPs (Long NOPs) from VLIW code. It schedules each instruction within a long instruction individually, using paired functional units and dynamic schedulers. Each dynamic scheduler checks for data dependencies and resource collisions while scheduling its instruction. We simulate the architecture and show that the CVLIW processor outperforms the VLIW processor across a wide range of cache sizes and a variety of numerical benchmark applications. These performance gains result from individual instruction scheduling and from the reduced size of the object code. Even when we assume a cache with a zero miss rate, the CVLIW processor's performance is still 9% to 15% higher than that of the VLIW processor, regardless of the benchmark application.
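The NOP and LNOP removal described above can be illustrated with a minimal sketch. This is not the paper's encoding: the slot layout, the `(unit, op)` tagging, and the names `compress` and `slot_count` are assumptions made for illustration only. The sketch also removes every NOP, whereas the abstract says nearly all are removed.

```python
from typing import List, Optional, Tuple

# Hypothetical model: a VLIW long instruction is a list with one slot per
# functional unit; None marks a NOP slot. An all-None bundle is an LNOP.
Bundle = List[Optional[str]]
CompressedBundle = List[Tuple[int, str]]


def compress(bundles: List[Bundle]) -> List[CompressedBundle]:
    """Drop NOP slots and all-NOP bundles, keeping (unit index, op) pairs."""
    compressed = []
    for bundle in bundles:
        # Tag each surviving operation with its functional unit (an assumed
        # encoding) so it can still be routed to the matching unit.
        ops = [(unit, op) for unit, op in enumerate(bundle) if op is not None]
        if ops:  # an empty result means the bundle was an LNOP: drop it
            compressed.append(ops)
    return compressed


def slot_count(bundles) -> int:
    """Total number of instruction slots the code occupies."""
    return sum(len(bundle) for bundle in bundles)


vliw = [
    ["add", None, "ld", None],
    [None, None, None, None],  # LNOP
    ["mul", "sub", None, None],
]
cvliw = compress(vliw)
print(slot_count(vliw), slot_count(cvliw))
```

Removing the LNOP discards the timing information that the original static schedule encoded, which is why each functional unit needs a dynamic scheduler that checks data dependencies and resource collisions at run time.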