Integrated temporal and spatial scheduling for extended operand clustered VLIW processors

Authors:
Rahul Nagpal; Y. N. Srikant
Affiliations:
Indian Institute of Science, Bangalore, India; Indian Institute of Science, Bangalore, India
Venue:
Proceedings of the 1st conference on Computing frontiers
Year:
2004

Abstract

Centralized register file architectures scale poorly in clock rate, chip area, and power consumption, and are therefore ill-suited to consumer electronic devices. This has led to architectures built from many interconnected clusters, each with its own register file and a few functional units. Among the many inter-cluster communication models proposed, the extended operand model extends some of the operand fields of an instruction with a cluster specifier, allowing the instruction to read some of its operands from other clusters at no extra cost. Scheduling for clustered processors involves spatial concerns (where to schedule) as well as temporal concerns (when to schedule). The scheduler must reconcile two conflicting requirements: aggressively exploiting the parallelism offered by the hardware, and limiting inter-cluster communication to the available slots. This paper proposes an integrated spatial and temporal scheduling algorithm for extended operand clustered VLIW processors and evaluates its effectiveness in improving the run-time performance of the code without a code size penalty.
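
To make the idea of choosing "where" and "when" together more concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm proposed in the paper, whose details the abstract does not give. It is a simple greedy list scheduler under assumed simplifications: every operation has unit latency, each cluster has `fus` functional units per cycle, and each cluster may perform at most `xreads` cross-cluster operand reads per cycle (a stand-in for the "available slots" of the extended operand model). A remote read adds no latency, matching the "no extra cost" property. The function name `schedule`, its parameters, and the example graph are all hypothetical.

```python
from collections import defaultdict


def schedule(ops, n_clusters=2, fus=1, xreads=1):
    """Greedy integrated (cycle, cluster) assignment; illustrative only.

    ops: dict mapping each op name to the list of ops producing its
         operands, given in topological order (producers first).
    Returns a dict mapping op name -> (cycle, cluster).
    """
    fu_used = defaultdict(int)  # (cycle, cluster) -> functional units in use
    rd_used = defaultdict(int)  # (cycle, cluster) -> cross-cluster reads in use
    placed = {}
    for op, preds in ops.items():
        # Guard: some cluster must be able to fit this op's remote operands.
        need = min(sum(placed[p][1] != c for p in preds)
                   for c in range(n_clusters))
        if need > xreads or fus < 1:
            raise ValueError(f"{op} cannot be placed under these limits")
        # Temporal lower bound: one cycle after the latest producer.
        cycle = max((placed[p][0] + 1 for p in preds), default=0)
        while op not in placed:
            best = None
            for c in range(n_clusters):
                # Operands whose producer sits on a different cluster.
                remote = sum(placed[p][1] != c for p in preds)
                if fu_used[cycle, c] >= fus:
                    continue  # no free functional unit on this cluster
                if rd_used[cycle, c] + remote > xreads:
                    continue  # not enough cross-cluster read slots
                # Prefer fewer remote reads, then the less loaded cluster.
                key = (remote, fu_used[cycle, c], c)
                if best is None or key < best[0]:
                    best = (key, c, remote)
            if best is None:
                cycle += 1  # nothing fits in this cycle; try the next one
                continue
            _, c, remote = best
            fu_used[cycle, c] += 1
            rd_used[cycle, c] += remote
            placed[op] = (cycle, c)
    return placed


# Hypothetical dependence graph: c uses a and b, d uses a, e uses c and d.
example = {"a": [], "b": [], "c": ["a", "b"], "d": ["a"], "e": ["c", "d"]}
result = schedule(example)
for name, (cycle, cluster) in result.items():
    print(f"{name}: cycle {cycle}, cluster {cluster}")
```

Because the cluster and the cycle are picked in the same step, an operation is never bound to a cluster whose read slots turn out to be exhausted in the cycle it needs. This is the kind of conflict that a separate assign-then-schedule flow can run into. A realistic scheduler would also account for operation latencies, heterogeneous functional units, and priority ordering, all of which this sketch omits.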