Realistic scheduling: compaction for pipelined architectures

Authors:
Alexandru Nicolau; Roni Potasman
Affiliations:
Information and Computer Science Department, University of California, Irvine, CA; Dept. of Electrical and Computer Engineering, University of California, Irvine, CA
Venue:
MICRO 23 Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture
Year:
1990

Abstract

This paper presents an approach to developing microcode for parallel and pipelined machines. The approach is geared towards mapping programs with real-time constraints and/or massive time requirements onto synchronous parallel computers (VLIWs, superscalars and microengines). To extract maximal parallelism from such machines, both the spatial (multiple functional units) and the temporal (pipelined) capabilities of the architecture need to be exploited. Until now, parallelizing compilers for parallel machines have not taken full advantage of pipelining capabilities: they have either assumed that all operations take one cycle or added pipelining as an afterthought. Both approaches restrict the achievable speed-up. We built a system based on a set of low-level transformations called Pipelined Percolation Scheduling (PPS). The transformations integrate the exploitation of temporal and spatial parallelism. Although these low-level transformations are integrated into our system, they are self-contained and may be used separately by applying 'higher level' transformations (on top of PPS) to optimize performance for a target architecture.
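
To make the distinction between spatial and temporal parallelism concrete, here is a minimal sketch of a latency-aware greedy scheduler. It is not the PPS transformations described in the paper; it is a generic cycle-by-cycle list scheduler used purely as an illustration. The operation names, latencies, issue width, and the helpers `list_schedule` and `makespan` are all hypothetical. Functional units are assumed to be fully pipelined (a new operation can be issued every cycle) and every latency is assumed to be at least one cycle.

```python
from typing import Dict, List


def list_schedule(latency: Dict[str, int],
                  deps: Dict[str, List[str]],
                  width: int) -> Dict[str, int]:
    """Greedy cycle-by-cycle scheduling over an acyclic dependence graph.

    latency[op]: cycles until op's result is available (assumed >= 1).
    deps[op]:    operations whose results op consumes.
    width:       number of operations that may be issued per cycle.
    Returns the issue cycle chosen for each operation.
    """
    issue: Dict[str, int] = {}
    cycle = 0
    while len(issue) < len(latency):
        slots = width
        for op in latency:  # dict insertion order acts as the priority order
            if slots == 0:
                break
            if op in issue:
                continue
            # An op is ready only when every operand has been produced.
            if all(p in issue and issue[p] + latency[p] <= cycle
                   for p in deps.get(op, [])):
                issue[op] = cycle
                slots -= 1
        cycle += 1
    return issue


def makespan(issue: Dict[str, int], latency: Dict[str, int]) -> int:
    """Cycle at which the last result becomes available."""
    return max(issue[op] + latency[op] for op in issue)


# Hypothetical example: two loads feed a multiply, which feeds an add;
# an independent increment/compare pair can fill the load delay cycles.
latency = {"ld_a": 3, "ld_b": 3, "mul": 2, "add": 1, "inc_i": 1, "cmp": 1}
deps = {"mul": ["ld_a", "ld_b"], "add": ["mul"], "cmp": ["inc_i"]}

issue = list_schedule(latency, deps, width=2)
print(issue)                     # issue cycle per operation
print(makespan(issue, latency))  # total schedule length in cycles
```

In this sketch the two loads issue together in cycle 0 (spatial parallelism), and the independent `inc_i` and `cmp` operations issue in cycles 1 and 2 while the loads are still in flight (temporal parallelism). A scheduler that assumed every operation took one cycle would instead place `mul` in cycle 1, before its operands exist on the pipelined hardware.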