Software pipelining is an aggressive scheduling technique that generates efficient code for loops and is particularly effective for VLIW architectures. Few software pipelining algorithms, however, can efficiently schedule loops that contain conditional branches. We have developed an algorithm, All Paths Pipelining (APP), that addresses this shortcoming. A run is the execution of consecutive iterations that all take the same path through a loop. APP is designed to achieve optimal or near-optimal performance for any run of iterations while providing efficient code for transitioning between runs. It does so by combining techniques from modulo scheduling and kernel recognition, the two main approaches to software pipelining loops. We have implemented APP in our research compiler and evaluated it by executing the generated code on a VLIW instruction-set simulator. For a processor with five heterogeneous functional units, APP improves performance by a further 1% to 23% over basic software pipelining by effectively pipelining loops with conditional branches.
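The notion of a run can be made concrete with a small sketch. The code below is not the APP algorithm and does no scheduling; it only illustrates the definition above by splitting a per-iteration path trace into runs and counting the transitions between them. The path labels ("A", "B", "C") and the function names `split_into_runs` and `count_transitions` are hypothetical, introduced here purely for illustration.

```python
from itertools import groupby


def split_into_runs(path_trace):
    """Group a per-iteration path trace into (path_id, run_length) pairs.

    A run is a maximal sequence of consecutive iterations that all
    take the same path through the loop body.
    """
    return [(path, sum(1 for _ in group)) for path, group in groupby(path_trace)]


def count_transitions(runs):
    """Number of points where execution switches from one run to the next."""
    return max(len(runs) - 1, 0)


# Hypothetical trace: the path taken by each of ten loop iterations.
trace = ["A", "A", "A", "B", "B", "A", "C", "C", "C", "C"]
runs = split_into_runs(trace)
print(runs)                     # [('A', 3), ('B', 2), ('A', 1), ('C', 4)]
print(count_transitions(runs))  # 3
```

In this illustrative trace, a scheduler following the abstract's goal would want each of the four runs to execute at a near-optimal rate for its own path, with efficient code covering the three transitions.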