Split-Path Enhanced Pipeline Scheduling

Authors:
SangMin Shim;Soo-Mook Moon
Affiliations:
-;IEEE
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2003

Abstract

Software pipelining increases loop execution throughput by overlapping the execution of successive iterations in a pipelined fashion. For loops with control flow, however, software pipelining is not straightforward because the overlap of more than one execution path must be considered. Modulo scheduling simply transforms such loops into straight-line loops through if-conversion, which in effect yields a fixed, worst-case initiation interval (II) across all paths. In contrast, all-path pipelining (APP) and enhanced pipeline scheduling (EPS) can achieve a variable II that depends on the path taken at execution time. Unfortunately, APP concentrates only on the overlap within the same path and entirely loses the overlap between different paths, whereas EPS attempts to overlap all paths together and fails to produce a tight schedule for each individual path, especially when resource constraints are tight. In this paper, we propose a new approach to EPS called split-path EPS (SP-EPS), which first splits each individual path via tail duplication and then performs EPS in a way that guarantees a tight schedule for each path while still producing a competitive cross-path schedule. We also extend SP-EPS to outer loops, so that frequent paths that bypass the inner loop are split and then scheduled by SP-EPS. Our experimental results on nontrivial integer benchmarks show that SP-EPS can achieve a geometric mean speedup of as much as 10 percent over EPS when innermost loops are scheduled by SP-EPS, and a geometric mean speedup of 11.9 percent when outer loops are also scheduled by SP-EPS.
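The contrast between a fixed, worst-case II and a path-dependent II can be illustrated with a small toy model. The sketch below is not the SP-EPS algorithm and is not taken from the paper. It assumes that the II is bounded only by issue resources (recurrences and cross-path overlap are ignored), and every name and number in it (`ISSUE_WIDTH`, `COMMON_OPS`, `PATH_OPS`, `PATH_PROB`, the helper functions) is hypothetical.

```python
from math import ceil

def resource_ii(num_ops: int, issue_width: int) -> int:
    """Resource-only lower bound on the initiation interval, in cycles."""
    return ceil(num_ops / issue_width)

def if_converted_ii(common_ops: int, path_ops: dict, issue_width: int) -> int:
    """II when all paths are merged into one predicated straight-line body.

    Every iteration issues the operations of every path, so the II is
    the same no matter which path is actually taken.
    """
    return resource_ii(common_ops + sum(path_ops.values()), issue_width)

def expected_variable_ii(common_ops: int, path_ops: dict,
                         path_prob: dict, issue_width: int) -> float:
    """Average II when each path gets its own tight schedule.

    Each path pays only for its own operations, weighted by how often
    the path is taken (an assumed profile, not measured data).
    """
    return sum(
        path_prob[p] * resource_ii(common_ops + path_ops[p], issue_width)
        for p in path_ops
    )

# Hypothetical loop: 4 shared operations plus a short, frequent path and a
# long, rare path, on an assumed 4-issue machine.
ISSUE_WIDTH = 4
COMMON_OPS = 4
PATH_OPS = {"then": 2, "else": 9}
PATH_PROB = {"then": 0.9, "else": 0.1}

fixed_ii = if_converted_ii(COMMON_OPS, PATH_OPS, ISSUE_WIDTH)
variable_ii = expected_variable_ii(COMMON_OPS, PATH_OPS, PATH_PROB, ISSUE_WIDTH)

if __name__ == "__main__":
    print(f"fixed II (if-converted): {fixed_ii} cycles")
    print(f"expected II (per-path):  {variable_ii:.2f} cycles")
```

Under these assumed numbers the merged body needs 4 cycles per iteration, while the frequent short path alone needs only 2, so the profile-weighted average II is well below the fixed value. The model says nothing about how a scheduler reaches those per-path bounds; it only shows why a variable II can pay off when one path dominates.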