An efficient heuristic for instruction scheduling on clustered VLIW processors

Authors:
Xuemeng Zhang; Hui Wu; Jingling Xue
Affiliations:
The University of New South Wales, Sydney, Australia; The University of New South Wales, Sydney, Australia; The University of New South Wales, Sydney, Australia
Venue:
CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
Year:
2011

Abstract

Clustering is a well-known technique for improving the scalability of classical VLIW processors. A clustered VLIW processor consists of multiple clusters, each with its own register file and functional units. This paper presents a novel phase-coupled, priority-based heuristic for scheduling the instructions of a basic block on a clustered VLIW processor. Our heuristic converts the instruction scheduling problem into the problem of scheduling a set of instructions with a common deadline. The priority of each instruction v_i is its l_max(v_i)-successor-tree-consistent deadline, which is an upper bound on the latest completion time of v_i in any feasible schedule for a relaxed problem in which the precedence-latency constraints between v_i and all its successors, as well as the resource constraints, are considered. We have simulated our heuristic, the UAS heuristic, and the Integrated heuristic on 808 basic blocks taken from the MediaBench II benchmark suite using six processor models. On average, for the six processor models, our heuristic improves on the UAS heuristic by 25%, 25%, 33%, 23%, 26%, and 27%, respectively, and on the Integrated heuristic by 15%, 16%, 15%, 9%, 20%, and 8%, respectively.
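
The general idea of deadline-driven, phase-coupled scheduling can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the deadlines below are derived from precedence-latency constraints only and ignore resource constraints, so they are looser than the successor-tree-consistent deadlines the abstract describes. The function names `deadlines` and `schedule`, the parameters `n_clusters`, `units`, and `copy_delay`, the fully pipelined functional units, and the greedy "earliest start" cluster choice are all assumptions made for illustration.

```python
from collections import defaultdict

def deadlines(succs, exec_time, common_deadline):
    """Latest completion time of each instruction, working backwards from a
    common deadline. Simplification: resource constraints are ignored."""
    d = {}

    def visit(v):
        if v not in d:
            best = common_deadline
            for s, lat in succs.get(v, []):
                # s starts at least `lat` cycles after v completes and must
                # itself complete by its own deadline.
                best = min(best, visit(s) - exec_time[s] - lat)
            d[v] = best
        return d[v]

    for v in exec_time:
        visit(v)
    return d

def schedule(succs, exec_time, n_clusters=2, units=1, copy_delay=1):
    """Pick a cluster and a start cycle for each instruction in one pass
    (phase-coupled), always taking the ready instruction with the smallest
    deadline first."""
    preds = defaultdict(list)
    for v, outs in succs.items():
        for s, lat in outs:
            preds[s].append((v, lat))

    # A loose common deadline; in this simplified version only the relative
    # order of the resulting deadlines matters.
    horizon = sum(exec_time.values()) + sum(
        lat for outs in succs.values() for _, lat in outs)
    d = deadlines(succs, exec_time, horizon)

    used = defaultdict(int)   # (cluster, cycle) -> issue slots already taken
    start, place = {}, {}
    remaining = set(exec_time)
    while remaining:
        ready = [v for v in remaining
                 if all(p in start for p, _ in preds[v])]
        v = min(ready, key=lambda x: (d[x], x))   # smallest deadline first
        best = None
        for c in range(n_clusters):
            t = 0
            for p, lat in preds[v]:
                # Assumed model: a value produced on another cluster
                # arrives `copy_delay` cycles later.
                extra = copy_delay if place[p] != c else 0
                t = max(t, start[p] + exec_time[p] + lat + extra)
            while used[(c, t)] >= units:          # wait for a free issue slot
                t += 1
            if best is None or t < best[0]:
                best = (t, c)
        start[v], place[v] = best
        used[(place[v], start[v])] += 1
        remaining.remove(v)
    return start, place

# Hypothetical four-instruction basic block: a and b feed c, c feeds d.
succs = {"a": [("c", 0)], "b": [("c", 0)], "c": [("d", 1)]}
exec_time = {"a": 1, "b": 1, "c": 2, "d": 1}
start, place = schedule(succs, exec_time)
print(start, place)
```

In this toy block, `a` and `b` are independent and can be issued on different clusters in the same cycle, while `c` has to wait for whichever operand must cross clusters. A scheduler in the spirit of the abstract would replace `deadlines` with tighter bounds that also account for resource contention.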