Partitioned Schedules for Clustered VLIW Architectures

Authors:
Affiliations:
Venue:
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Year:
1998

Citing 10
Cited 8

The Cydra 5 Departmental Supercomputer: Design Philosophies, Decisions, and Trade-Offs

Computer
Partitioned register files for VLIWs: a preliminary analysis of tradeoffs

MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
A new technique for exploiting regularity in data path synthesis

EURO-DAC '94 Proceedings of the conference on European design automation
Partitioned register file for TTAs

Proceedings of the 28th annual international symposium on Microarchitecture
Unrolling-based optimizations for modulo scheduling

Proceedings of the 28th annual international symposium on Microarchitecture
Assignment of storage values to sequential read-write memories

EURO-DAC '96/EURO-VHDL '96 Proceedings of the conference on European design automation
Allocating Lifetimes to Queues in Software Pipelined Architectures

Euro-Par '97 Proceedings of the Third International Euro-Par Conference on Parallel Processing
Very Long Instruction Word architectures and the ELI-512

ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing

MICRO 14 Proceedings of the 14th annual workshop on Microprogramming
An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family

Computer

Two-level hierarchical register file organization for VLIW processors

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Tailoring pipeline bypassing and functional unit mapping to application in clustered VLIW architectures

CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Evaluating the Use of Register Queues in Software Pipelined Loops

IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Modulo scheduling with integrated register spilling for clustered VLIW architectures

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Exploiting Pseudo-Schedules to Guide Data Dependence Graph Partitioning

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Software and hardware techniques to optimize register file utilization in VLIW architectures

International Journal of Parallel Programming
Facilitating compiler optimizations through the dynamic mapping of alternate register structures

CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Modulo scheduling without overlapped lifetimes

Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper presents results on a new approach to partitioning a modulo-scheduled loop for distributed execution on parallel clusters of functional units organized as a VLIW machine. A distinctive characteristic of this architecture is the use of register files organized by means of queues, which results in a number of advantages over conventional schemes, but also requires the development of specific compiling and hardware features. We have investigated a scheme based on copy operations to deal with data values to be consumed more than once during loop execution. Experiments with loop unrolling were also performed in order to optimize both loop execution and the use of machine resources. A partitioning algorithm has been implemented to perform some experiments with the clustered architecture model, an organization widely accepted as being essential for very wide issue machines.