Bulldog: a compiler for VLSI architectures
Bulldog: a compiler for VLSI architectures
Register allocation via graph coloring
Register allocation via graph coloring
ACM Transactions on Programming Languages and Systems (TOPLAS)
Optimal and near-optimal global register allocations using 0–1 integer programming
Software—Practice & Experience
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Linear scan register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Lx: a technology platform for customizable VLIW embedded processing
Proceedings of the 27th annual international symposium on Computer architecture
Instruction scheduling for clustered VLIW architectures
ISSS '00 Proceedings of the 13th international symposium on System synthesis
Parallel processing: a smart compiler and a dumb machine
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
Modulo scheduling with integrated register spilling for clustered VLIW architectures
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Cluster assignment for high-performance embedded VLIW processors
ACM Transactions on Design Automation of Electronic Systems (TODAES)
The TigerSHARC DSP Architecture
IEEE Micro
A Unified Modulo Scheduling and Register Allocation Technique for Clustered Processors
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Instruction Scheduling for Clustered VLIW DSPs
PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
CARS: A New Code Generation Framework for Clustered ILP Processors
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
A generalized algorithm for graph-coloring register allocation
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Scratchpad memories vs locked caches in hard real-time systems: a quantitative comparison
Proceedings of the conference on Design, automation and test in Europe
WCET-Directed Dynamic Scratchpad Memory Allocation of Data
ECRTS '07 Proceedings of the 19th Euromicro Conference on Real-Time Systems
Pragmatic integrated scheduling for clustered VLIW architectures
Software—Practice & Experience
Register allocation by puzzle solving
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
WCET-driven Cache-based Procedure Positioning Optimizations
ECRTS '08 Proceedings of the 2008 Euromicro Conference on Real-Time Systems
Minimizing WCET for Real-Time Embedded Systems via Static Instruction Cache Locking
RTAS '09 Proceedings of the 2009 15th IEEE Symposium on Real-Time and Embedded Technology and Applications
WCET-aware register allocation based on graph coloring
Proceedings of the 46th Annual Design Automation Conference
Optimizing scheduling and intercluster connection for application-specific DSP processors
IEEE Transactions on Signal Processing
Energy efficient joint scheduling and multi-core interconnect design
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Task Assignment with Cache Partitioning and Locking for WCET Minimization on MPSoC
ICPP '10 Proceedings of the 2010 39th International Conference on Parallel Processing
Joint task assignment and cache partitioning with cache locking for WCET minimization on MPSoC
Journal of Parallel and Distributed Computing
Register allocation for programs in SSA-Form
CC'06 Proceedings of the 15th international conference on Compiler Construction
Register allocation via coloring
Computer Languages
Energy-Efficient Joint Scheduling and Application-Specific Interconnection Design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Loop Transforming for Reducing Data Alignment on Multi-Core SIMD Processors
Journal of Signal Processing Systems
Hi-index | 0.00 |
Worst-Case Execution Time (WCET) is one of the most important metrics in real-time embedded system design. For embedded systems with clustered VLIW architecture, register allocation, instruction scheduling, and cluster assignment are three key activities to pursue code optimization which have profound impact on WCET. At the same time, these three activities exhibit a phase ordering problem: Independently performing register allocation, scheduling and cluster assignment could have a negative effect on the other phases, thereby generating sub-optimal compiled codes. In this paper, a compiler level optimization, namely WCET-aware Re-scheduling Register Allocation (WRRA), is proposed to achieve WCET minimization for real-time embedded systems with clustered VLIW architecture. The novelty of the proposed approach is that the effects of register allocation, instruction scheduling and cluster assignment on the quality of generated code are taken into account for WCET minimization. These three compilation processes are integrated into a single phase to obtain a balanced result. The proposed technique is implemented in Trimaran 4.0. The experimental results show that the proposed technique can reduce WCET effectively, by 33% on average.