Warp architecture and implementation
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Efficient hardware for multiway jumps and pre-fetches
MICRO 18 Proceedings of the 18th annual workshop on Microprogramming
URPR—An extension of URCR for software pipelining
MICRO 19 Proceedings of the 19th annual workshop on Microprogramming
The program dependence graph and its use in optimization
ACM Transactions on Programming Languages and Systems (TOPLAS)
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
“Combining” as a compilation technique for VLIW architectures
MICRO 22 Proceedings of the 22nd annual workshop on Microprogramming and microarchitecture
A theory of compaction-based parallelization
Proceedings of the Second European Symposium on Programming
Selected papers of the second workshop on Languages and compilers for parallel computing
Dependence flow graphs: an algebraic approach to program dependencies
POPL '91 Proceedings of the 18th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
On optimal parallelization of arbitrary loops
Journal of Parallel and Distributed Computing
A timed Petri-net model for fine-grain loop scheduling
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Register allocation for software pipelined loops
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
An efficient resource-constrained global scheduling technique for superscalar and VLIW processors
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Enhanced region scheduling on a program dependence graph
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Enhanced modulo scheduling for loops with conditional branches
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Lifetime-sensitive modulo scheduling
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Instruction-level parallel processing: history, overview, and perspective
The Journal of Supercomputing - Special issue on instruction-level parallelism
Efficient scheduling of fine grain parallelism in loops
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Using a lookahead window in a compaction-based parallelizing compiler
MICRO 23 Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture
A compilation technique for software pipelining of loops with conditional jumps
MICRO 20 Proceedings of the 20th annual workshop on Microprogramming
GURPR—a method for global software pipelining
MICRO 20 Proceedings of the 20th annual workshop on Microprogramming
A global resource-constrained parallelization technique
ICS '89 Proceedings of the 3rd international conference on Supercomputing
A Fortran compiler for the FPS-164 scientific computer
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Dependence graphs and compiler optimizations
POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Perfect Pipelining: A New Loop Parallelization Technique
ESOP '88 Proceedings of the 2nd European Symposium on Programming
Register Allocation, Renaming and Their Impact on Fine-Grain Parallelism
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
MICRO 14 Proceedings of the 14th annual workshop on Microprogramming
2n-way jump microinstruction hardware and an effective instruction binding method
MICRO 13 Proceedings of the 13th annual workshop on Microprogramming
The microprogramming of pipelined processors
ISCA '77 Proceedings of the 4th annual symposium on Computer architecture
Compaction-Based Parallelization
Compaction-Based Parallelization
A systolic array optimizing compiler
A systolic array optimizing compiler
Wavesched: a novel scheduling technique for control-flow intensive behavioral descriptions
ICCAD '97 Proceedings of the 1997 IEEE/ACM international conference on Computer-aided design
Optimal Modulo Scheduling Through Enumeration
International Journal of Parallel Programming
Exploiting state equivalence on the fly while applying code motion and speculation
DATE '99 Proceedings of the conference on Design, automation and test in Europe
A reordering technique for efficient code motion
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
RS-FDRA: a register sensitive software pipelining algorithm for embedded VLIW processors
Proceedings of the ninth international symposium on Hardware/software codesign
FDRA: a software-pipelining algorithm for embedded VLIW processors
ISSS '00 Proceedings of the 13th international symposium on System synthesis
Optimal software pipelining of loops with control flows
ICS '02 Proceedings of the 16th international conference on Supercomputing
Constraint analysis for DSP code generation
Readings in hardware/software co-design
CALiBeR: a software pipelining algorithm for clustered embedded VLIW processors
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Unroll-Based Copy Elimination for Enhanced Pipeline Scheduling
IEEE Transactions on Computers
Run-Time Support to Register Allocation for Loop Parallelization of Image Processing Programs
HPCN Europe 2000 Proceedings of the 8th International Conference on High-Performance Computing and Networking
Unroll-Based Copy Elimination for Enhanced Pipeline Scheduling
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
A First Step Towards Time Optimal Software Pipelining of Loops with Control Flows
CC '01 Proceedings of the 10th International Conference on Compiler Construction
Loop Shifting and Compaction for the High-Level Synthesis of Designs with Complex Control Flow
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Single-Dimension Software Pipelining for Multi-Dimensional Loops
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Time optimal software pipelining of loops with control flows
International Journal of Parallel Programming
Instruction level parallelism of non-uniform acyclic loops
Journal of Computing Sciences in Colleges
Merging Head and Tail Duplication for Convergent Hyperblock Formation
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Register allocation for software pipelined multidimensional loops
ACM Transactions on Programming Languages and Systems (TOPLAS)
Hardware/software partitioning and pipelined scheduling on runtime reconfigurable FPGAs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Hi-index | 0.00 |
This paper presents a software pipelining algorithm for the automatic extraction of fine-grain parallelism in general loops. The algorithm accounts for machine resource constraints in a way that smoothly integrates the management of resource constraints with software pipelining. Furthermore, generality in the software pipelining algorithm is not sacrificed to handle resource constraints, and scheduling choices are made with truly global information. Proofs of correctness and the results of experiments with an implementation are also presented.