Bulldog: a compiler for VLSI architectures
Bulldog: a compiler for VLSI architectures
Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
Numerical recipes: the art of scientific computing
Numerical recipes: the art of scientific computing
A VLIW architecture for a trace scheduling compiler
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
Software pipelining: an effective scheduling technique for VLIW machines
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Overlapped loop support in the Cydra 5
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
A Fortran compiler for the FPS-164 scientific computer
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Very Long Instruction Word architectures and the ELI-512
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
MICRO 14 Proceedings of the 14th annual workshop on Microprogramming
Control and data dependence for program transformations.
Control and data dependence for program transformations.
Register allocation for software pipelined loops
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Sentinel scheduling for VLIW and superscalar processors
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Code generation schema for modulo scheduled loops
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Enhanced modulo scheduling for loops with conditional branches
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Lifetime-sensitive modulo scheduling
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Sentinel scheduling: a model for compiler-controlled speculative execution
ACM Transactions on Computer Systems (TOCS)
Height reduction of control recurrences for ILP processors
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Iterative modulo scheduling: an algorithm for software pipelining loops
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Minimum register requirements for a modulo schedule
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
ACM Computing Surveys (CSUR)
Optimum modulo schedules for minimum register requirements
ICS '95 Proceedings of the 9th international conference on Supercomputing
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Critical path reduction for scalar programs
Proceedings of the 28th annual international symposium on Microarchitecture
Modulo scheduling with multiple initiation intervals
Proceedings of the 28th annual international symposium on Microarchitecture
Region-based compilation: an introduction and motivation
Proceedings of the 28th annual international symposium on Microarchitecture
Unrolling-based optimizations for modulo scheduling
Proceedings of the 28th annual international symposium on Microarchitecture
Stage scheduling: a technique to reduce the register requirements of a modulo schedule
Proceedings of the 28th annual international symposium on Microarchitecture
Modulo scheduling of loops in control-intensive non-numeric programs
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Heuristics for register-constrained software pipelining
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Software pipelining loops with conditional branches
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Modulo Scheduling with Reduced Register Pressure
IEEE Transactions on Computers
Quantitative Evaluation of Register Pressure on Software Pipelined Loops
International Journal of Parallel Programming
Effective cluster assignment for modulo scheduling
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Control CPR: a branch height reduction optimization for EPIC architectures
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Lifetime-Sensitive Modulo Scheduling in a Production Environment
IEEE Transactions on Computers
Software Pipelining Irregular Loops On the TMS320C6000 VLIW DSP Architecture
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
The Intel IA-64 Compiler Code Generator
IEEE Micro
Architecture and Performance of the Hitachi SR2201 Massively Parallel Processor System
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Reduced code size modulo scheduling in the absence of hardware support
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Register Constrained Modulo Scheduling
IEEE Transactions on Parallel and Distributed Systems
Hi-index | 0.01 |
Modulo scheduling theory can be applied successfully to overlap Fortran DO loops on pipelined computers issuing multiple operations per cycle both with and without special loop architectural support [1, 2, 3]. This paper shows that a broader class of loops - repeat-until, while, and loops with more than one exit - where the trip count is not known beforehand, can also be overlapped efficiently on multiple issue pipelined machines. Special features that are required in the architecture, as well as compiler representations for accelerating these loop constructs, are discussed. Performance results are presented for a few select examples. A prototype scheduler is currently under construction at HP Laboratories.